PREFACE


Theoretical physics, though most intimately connected with exper¬ 
imental physics, differs substantially from the latter both in methods 
and results. 

Experiments establish individual facts, some of them of outstand¬ 
ing, fundamental significance. Theory explains them, but it also 
formulates general principles. If the task of theoretical physics were 
merely to analyse the results of experiments, it would be no more 
than the experimenter’s aid. 

Such theoretical achievements as the enunciation of mechanics, 
electrodynamics, statistical and quantum mechanics go far beyond 
simply interpreting a number of separate experiments. Theoretical 
physics has in many ways contributed to the moulding of mankind’s 
scientific world outlook as a whole, it has influenced the thinking 
of people far removed from the natural sciences. 

Knowledge of theoretical physics requires knowledge of experi¬ 
mental physics, it requires an understanding of the relationships 
between physical phenomena and general laws; in short, everything 
one could call knowledge of physics in general is both important 
and useful. 

But equally important is the mastery of the main tool of theoret¬ 
ical physics, mathematics; whatever branch of mathematics one 
invokes, one must fully master its basic idea and general method. 
As Lev Landau remarked half-jokingly, it is possible to be a theoret¬ 
ical physicist without really knowing physics, but not without 
knowing mathematics. 

The sheer volume of essential mathematical knowledge is the main 
obstacle in the study of theoretical physics. Add to this that text¬ 
books on mathematics set forth the material as the mathematicians 
see right, which is not how theoretical physicists need it. 

This book attempts to present theoretical physics in a manner 
that would require the student to have the barest reasonable min¬ 
imum of advance mathematical knowledge: elements of infinitesi¬ 
mal calculus, the beginnings of analytical geometry, and vector 
algebra. The course sets forth basic information in vector analysis, 
matrix, tensor and spinor algebra, and a small section devoted to 
spherical functions. These mathematical explications are in part 
incorporated in the main text, in part in the exercises, where the 
respective problems are provided with worked solutions, which 
differ from the study material in the conciseness of their presentation.
Many physical problems are also provided with such solutions.

The best way of setting forth study material is always open to debate.
If one attempts to sidestep complex or subtle questions or oversimpli¬ 
fy the argumentation in the hope that the unsophisticated reader 
will fail to notice a certain “sleight of hand”, the best that can be 
achieved is merely an illusion of understanding, suitable only in the 
reading of popular science books. 

On the other hand, a text overburdened with clarifications and 
reservations, with lengthy discourses on every issue and what is 
known in mathematics as “epsilontics”, is capable of confusing the 
student and obscuring the essentials behind trivialities. 

The art of the theoretician as a teacher consists in an ability to 
find the optimum mode of presentation, guided by his prospective 
audience or readership. 

In order to gain a real understanding of theoretical physics it is 
essential to constantly bear in mind the fundamental ideas of the 
subject, the purpose of the specific discourse or computation, and 
the connections among all the details and general principles. Further¬ 
more, in theoretical physics there is no knowledge without practical 
skill: only he understands the subject who has properly mastered 
its methods. Passive digestion is impossible here, simply memoriz¬ 
ing is useless. 

This course consists of two volumes. The first sets forth the funda¬ 
mental laws of physics. The second, the statistical laws of large 
assemblies of particles—gases, liquids, solids. The laws that emerge 
in such assemblies are based on the laws of elementary interactions 
and the properties of large numbers. 

The presentation of the subject matter adopted here appears to 
be most suitable for the purposes of this book. There is virtually 
no information of a purely abstract character, containing only 
summaries of results or mentioning new ideas without expounding 
upon them. Whatever is asserted is deduced from general laws. 
In each chapter these laws precede everything else: it is hardly ex¬ 
pedient to set forth basic principles after all the specifics are already 
known. 

I have attempted, as far as possible, to set forth the material in 
my own way. However, I have borrowed freely from the many- 
volumed encyclopedic Course of Theoretical Physics by L.D. Landau 
and E.M. Lifshitz, as well as from The Principles of Quantum Me¬ 
chanics by P.A.M. Dirac. 


A.S. Kompaneyets 


PART I


MECHANICS 


1 


GENERAL REMARKS 

Mechanics is the bedrock of all theoretical physics. Starting out 
with certain introductory propositions, we shall present the equa¬ 
tions of mechanics in the form most convenient for solving the 
concrete problems of dynamics and allowing for generalization over 
other domains of physical theory. 

Frames of Reference. To describe the motion of a mechanical 
system it is necessary to specify its position in space as a function 
of time. It is meaningful to speak only of the relative position of 
any body. For example, the position of a ship at sea is given by its 
latitude and longitude relative to specified points and lines on the 
globe. The position of a body in space can be measured relative to 
the sun or the centre of the Galaxy, etc. 

Besides stating the coordinates of the bodies of a system relative 
to a selected coordinate system it is necessary to specify the time 
at which the coordinates assume the given values. In other words, 
one needs a clock. Usually this is a uniform periodic process, natural 
or artificially reproduced in a mechanism, such as the rotation of the 
earth around its axis. 

The intuitive concept of a single universal time to which we are 
accustomed in everyday life is true only when the relative velocities 
of all bodies are small in comparison with the velocity of light. 
It is in the framework of this approximation that Newtonian me¬ 
chanics is valid. In the more general case it is necessary to state the 
system of bodies in which the clock used for measuring the time is 
fixed. If the coordinates stating the positions of moving bodies are 
fixed within the same system, it is said that a definite frame of reference
has been defined. (Thus, the position of a ship at a specific
time is verified against a clock located somewhere on the shore.) 
In Newtonian mechanics it is assumed that the readings of clocks 
are the same in all reference frames. We should remember, however, 
that this is a property not of time in general but only of reference 
systems of bodies moving slowly relative to one another. Here we 
shall accept this approximation. The generalizations will be made 
in Part II. 

Newton’s Second Law. Motion in mechanics consists in changes 
in the mutual spatial configurations of bodies with time, relative to 
a certain selected frame of reference. In formulating the laws of motion
an extremely convenient concept is that of a mass point, or
particle, that is, a body whose position in space is fully defined by
three Cartesian coordinates. Strictly speaking, this idealization is 
inapplicable to any real body. Nevertheless, it is quite reasonable 
when a body’s motion is sufficiently well defined by the displacement 
of any of its points and is independent of its rotations or defor¬ 
mations. 

The earth’s motion around the sun is independent of its rotation 
around its axis, while the flight of a bullet is strongly dependent 
on its rotational motion. In this sense the earth approximates the 
concept of a mass point more closely than a bullet. The absolute 
dimensions of bodies are immaterial. 

If we proceed from the concept of a mass point as the fundamental 
entity of mechanics, the law of motion (Newton's Second Law) is
formulated thus:

$$m \frac{d^2 \mathbf{r}}{dt^2} = \mathbf{F} \tag{1.1}$$

Here, $\mathbf{F}$ is the resultant of all the forces applied to the particle
(the vector sum of the forces) and $d^2\mathbf{r}/dt^2$ is the acceleration
vector, the Cartesian components of which are

$$\frac{d^2 x}{dt^2}, \quad \frac{d^2 y}{dt^2}, \quad \frac{d^2 z}{dt^2}$$

The quantity m involved in the equation characterizes the mass 
point and is called its mass. 

Force and Mass. Equation (1.1) is the physical definition of force. 
It should not, however, be seen as a simple identity or designation, 
because underlying it are a number of assumptions concerning the 
laws of motion. These assumptions are confirmed by the totality of 
experimental data regarding mechanical motion. For example, the
fact that Eq. (1.1) involves a second derivative with respect to 
time means that its solution requires a statement of the initial 
values of the coordinates and velocities, that is, the first derivatives, 
which is sufficient to determine the coordinates at all subsequent 
instants. Obviously, this fact can be deduced only from experi¬ 
mental data. 
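
The role of the initial data can be illustrated numerically. The following is a minimal sketch, not part of the original text: it integrates Eq. (1.1) in one dimension for a hypothetical uniform force, using the velocity-Verlet scheme (an assumed choice of integrator). The names `integrate_newton`, `x0`, `v0` and the numerical values are illustrative only.

```python
def integrate_newton(force, m, x0, v0, dt, steps):
    """Integrate m * d2x/dt2 = force(x) in one dimension (velocity Verlet).

    The trajectory is fixed completely by the initial coordinate x0
    and the initial velocity v0, as a second-order equation requires.
    """
    x, v = x0, v0
    a = force(x) / m
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt   # advance the coordinate
        a_new = force(x) / m                 # acceleration at the new point
        v = v + 0.5 * (a + a_new) * dt       # advance the velocity
        a = a_new
    return x, v


# Hypothetical example: a uniform force -m*g (values are assumptions).
g = 9.81
m = 2.0
x_end, v_end = integrate_newton(lambda x: -m * g, m,
                                x0=10.0, v0=3.0, dt=1e-3, steps=1000)

# For a constant force the result can be compared with the closed form
# x = x0 + v0*t - g*t**2/2 and v = v0 - g*t at t = 1.
print(x_end, 10.0 + 3.0 * 1.0 - 0.5 * g * 1.0 ** 2)
print(v_end, 3.0 - g * 1.0)
```

Changing either `x0` or `v0` changes the whole subsequent motion, while no further data (for instance an initial acceleration) can be prescribed independently.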

Equation (1.1) essentially defines the mode of interaction between 
bodies, indicating that it is effected in the form of forces imparting 
acceleration. This assertion, too, obviously derives from a generaliza¬ 
tion of experimental data. 

Newtonian mechanics makes a limiting assumption regarding 
force. It is assumed to depend only upon the mutual configuration 
of the bodies at the instant to which the equation refers, and to be 
explicitly independent of their configurations at preceding instants. 
As we shall see later in Part II, this assumption is valid only when 
the velocities of the bodies are small in comparison with the velocity 
of light. At large velocities the very definition of interaction changes 
substantially and cannot be expressed in the form (1.1). 

Equation (1.1) involves the quantity m characterizing the body, 
its mass. The mass of different bodies can be compared according 
to the acceleration which one and the same force imparts to them. 
The greater a body’s acceleration the less its mass. The mass of 
some body may be chosen as a standard, the choice being quite in¬ 
dependent of the standards of length and time. The dimensions, or 
unit of measurement, of mass is thus a special one, unrelated to the 
units of length and time. 

The properties of mass are established experimentally. Firstly, 
when two identical bodies are joined together, the result is a body 
of double mass in comparison with either of them. A duly stretched 
spring imparts to such a composite body half the acceleration it 
would to either of its components. In other words, mass is an addi¬ 
tive quantity, as is said when any quantity characterizing a body as 
a whole equals the sum of those quantities for all its parts separately. 
Experience indicates that the principle of additivity of mass is also 
applicable to bodies made up of different substances. 

Note that the widespread definition of mass as quantity of matter 
is meaningless since it does not state the manner in which the quan¬ 
tity is measured. The definition of mass from Eq. (1.1), on the 
other hand, contains such a statement. 

In Newtonian mechanics the mass of a body is a constant quan¬ 
tity which does not change in the body’s motion. But the additivity 
and constancy of mass follow solely from experimental data and are
in no way self-evident. These data are restricted to a specific domain 
of phenomena, namely those in which the forces of interaction do not 
accelerate bodies to speeds comparable with the velocity of light. 
In other words, the interactions should not, in a way, be too strong.
In atomic nuclei, where interactions are strong, the principle of the 
additivity of mass holds to an accuracy of only fractions of a per¬ 
centage point. 

We may note that if, instead of subjecting a body to the force of 
a stretched spring, it is subjected to the action of gravity, the accel¬ 
eration of a body of double mass is equal to the acceleration of each 
mass separately. It must be concluded from this that the force of 
gravity is, for some reason, itself proportional to the mass of a body. 
It is this exceptional property of gravity that lies at the basis of 
Einstein’s theory of gravitation. 

Inertial Frames of Reference. Equation (1.1) involves the accelera¬ 
tion of a particle. It is meaningless to speak of acceleration without 
stating the frame of reference in which it is measured. It is there¬ 
fore necessary to establish what, in each specific case, causes the 
acceleration. In other words, one must determine whether the accel¬ 
eration is due to interactions between bodies or to the motion of the 
reference frame itself. For example, the jolt felt by a passenger when 
a train brakes suddenly is evidence of the train’s nonuniform motion
relative to the earth. No one on the platform feels the jolt. This 
means that the passenger’s acceleration cannot be ascribed to in¬ 
teraction forces. Thus, a reference frame fixed with respect to the 
earth is distinguished by the property that in it accelerations of 
bodies are due solely to their interactions, for example, to the action 
of the force of gravity. Finer effects associated with the rotation of 
the earth will be discussed elsewhere. 

It can be supposed that there exists an ideal frame of reference in 
which all accelerations of bodies are due solely to forces of interaction. 
Obviously, a frame connected with the earth approximates such 
an ideal system more closely than one connected with the train. 

Whether a given force is due to interaction between bodies can be 
determined with the help of Newton's Third Law: such forces are
equal in magnitude and opposite in sense for any pair of interacting 
particles. This is valid only if the forces are transmitted instanta¬ 
neously, an assumption which Newtonian mechanics permits. 

If the acceleration of bodies in a given reference frame is due solely 
to their interactions, such a reference frame is called inertial. In an 
inertial frame of reference, a free material point not subject to the 
action of any other bodies moves uniformly in a straight line. 

The direction of gravity on the surface of the earth is determined 
with the help of a plumb line. But a rock dropped from a tall tower 
does not fall directly along the plumb line: it is deflected slightly 
to the east. Consequently this acceleration component is not due to 
the attraction of the earth, which proves the noninertiality of a 
reference frame connected with it. 

Ideal Constraints. Bodies in contact give rise to forces of interaction
which can be described with the help of the kinematic concept
of ideal constraints. Such constraints cause the points of a mechanical 
system to move along definite surfaces. If, for example, two material 
points are joined by an ideal inextensible constraint which holds 
them at a constant distance, any one of them moves along a sphere 
with the other point at its centre. 

In the most general case the restrictions imposed by constraints 
cause bodies to move in curved paths. Such motion is always acceler¬ 
ated. The acceleration can be formally ascribed to forces called the 
reaction forces of the ideal constraints. These forces are not given in 
advance as functions of the positions of the points. Integration of 
equations of type (1.1) for the case of additional restrictions imposed 
by constraints gives the reaction forces. In the next section we shall 
consider a method whereby the reaction forces can be bypassed in 
solving equations of motion. 

Besides reaction forces, motion along a rigid surface also leads to 
the appearance of friction forces. Their importance in applied 
mechanics is extremely great. But in motion with friction motion 
is imparted not only to the body as a whole but to its component 
molecules as well. The interaction between surfaces slipping over 
one another is extremely complex and acquires the form of a certain 
force of interaction only as a result of averaging over individual 
molecules. In this part we consider the fundamental laws referring 
to separate material points (particles), not to large associations of 
molecules. Friction forces are, accordingly, ignored. They are studied 
in detail in courses on theoretical mechanics. 


2 


LAGRANGE EQUATIONS 

Equation (1.1) was written in Cartesian coordinates. But any coor¬ 
dinate system is a matter of free choice, which means that when we 
describe some natural law in it we introduce an element of arbitrari¬ 
ness. Furthermore, we are also free to adopt a reference frame of our 
choice. The velocities of mass points relative to different reference 
frames are different. But it is desirable to formulate natural laws 
in such a way as to exclude, as far as possible, quantities which by 
definition refer to the observer (for example, coordinates) or, in 
other words, to exclude the element of arbitrariness from the 
description. 

For this we must pass from the differential law (1.1) to an integral
one. The value of an integral does not depend on the variables in 
which it is computed (for example, the area of a figure is the same 
whatever coordinates are used to calculate it: rectangular, polar, 
etc.). We can therefore hope to formulate the laws of mechanical 
motion in a way that would reduce them to statements involving 
integral expressions describing a certain finite segment of the motion. 

This is possible provided the following conditions hold:

(i) The constraints are ideal, that is, there are no friction forces. 

(ii) The interaction between mass points can be represented in
the form

$$\mathbf{F}_i = -\frac{\partial U}{\partial \mathbf{r}_i} \tag{2.1}$$

where the subscript $i$ refers to a mass point, and the quantity
$\partial U/\partial \mathbf{r}_i$ represents a vector with components
$\partial U/\partial x_i$, $\partial U/\partial y_i$, $\partial U/\partial z_i$.¹ The
quantity $U$ is the same for the whole of the mechanical system.
Its meaning will be discussed later.


Hamilton's Principle. Condition (2.1) is not as restrictive as it 
might appear. It holds for gravity forces, electrostatic forces, elastic 
forces, that is, precisely those to which Newtonian mechanics applies.
From now on we shall express forces in the form (2.1). For the sake
of simplicity, in subsequent formulas we shall assume that there is 
only one ideal constraint. This restriction is of no great significance 
since the transition to the case of several constraints is performed 
directly. We write the constraint condition in the form of the equation

$$F(\mathbf{r}_1, \ldots, \mathbf{r}_i, \ldots) = 0 \tag{2.2}$$

Now consider a certain change in the coordinates of the mass
points of the system, $\delta\mathbf{r}_i$, which we assume to be infinitesimal. The
change is not due to the motion of the points and can be treated as
a purely speculative operation. It should not, however, violate
condition (2.2), that is, it should be compatible with the constraints
imposed upon the system. For example, if the points are compelled
to move along a surface, the changes $\delta\mathbf{r}_i$ are taken along the surface,
while being completely arbitrary in other respects. On the other
hand, if as a result of the displacements the points remain on the
surface defined by Eq. (2.2), the displacements satisfy the obvious
condition

$$F(\ldots, \mathbf{r}_i + \delta\mathbf{r}_i, \ldots) = \sum_i \frac{\partial F}{\partial \mathbf{r}_i} \cdot \delta\mathbf{r}_i = 0 \tag{2.3}$$


1 More on differentiation with respect to a vector in the introduction to 
Part II. 
Here use was made of the fact that the quantities $\delta\mathbf{r}_i$ are infinitesimal,
so that $F$ expands in a Taylor series up to first derivatives
inclusive.

Consider a set of differential equations (1.1) with the supplementary
condition (2.2). This condition means that not all the variables $\mathbf{r}_i$
are independent. To make the number of independent variables
equal to the number of equations we multiply each equation by the
corresponding quantity $\delta\mathbf{r}_i$ and add them. We resolve force $\mathbf{F}_i$ into
two components: $\mathbf{F}_i = -\partial U/\partial \mathbf{r}_i + \mathbf{F}'_i$. The first component is due
to the interaction between the material points, the second describes
the forces due to the action of the constraints.

We now make use of the condition that the constraints are ideal.
We start with the simplest case of a smooth unchangeable surface
along which a mass point moves. The reaction force is perpendicular
to the surface, that is, the scalar product of vectors $\mathbf{F}'_i$ and $\delta\mathbf{r}_i$ is
zero. This product is the work done in the displacement of the
mass point along the surface, that is, the work of friction forces,
which we excluded in advance when we assumed the constraint to
be ideal.

In the case of two or, in general, several mass points, the individual
products $\mathbf{F}'_i \cdot \delta\mathbf{r}_i$ need not all separately vanish, since the points
may perform work on one another. For example, if two points are
joined by an ideal inextensible constraint and one of them is in
some way accelerated, it will draw the other point after it, that is,
perform work on it. Thus, in a system of several mass points joined
by ideal constraints the condition

$$\sum_i \mathbf{F}'_i \cdot \delta\mathbf{r}_i = 0 \tag{2.4}$$

is imposed on the reaction forces, the displacements being subject
to Eq. (2.3).

But it then follows from Eqs. (1.1) and (2.4) that the equation

$$\sum_i \left( m_i \frac{d^2\mathbf{r}_i}{dt^2} + \frac{\partial U}{\partial \mathbf{r}_i} \right) \cdot \delta\mathbf{r}_i = 0 \tag{2.5}$$

should hold for all displacements compatible with the constraints,
that is, satisfying Eq. (2.3). One of the displacements $\delta\mathbf{r}_i$ can be
excluded from the latter and substituted into (2.5), after which all
the other displacements, obviously, become independent.

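It is more convenient to use the method of undetermined multipliers,
since it becomes possible to preserve the symmetry of the formulas
with respect to all $\delta\mathbf{r}_i$'s. Multiply Eq. (2.3) by a factor $\alpha$ and add
the result to (2.5) to get

$$\sum_i \left( m_i \frac{d^2\mathbf{r}_i}{dt^2} + \frac{\partial U}{\partial \mathbf{r}_i} + \alpha \frac{\partial F}{\partial \mathbf{r}_i} \right) \cdot \delta\mathbf{r}_i = 0 \tag{2.6a}$$
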
Since $\alpha$ is arbitrary, we have introduced an extra parameter into
the equation, thanks to which we may consider all the displacements
to be quite independent of one another. It is therefore possible
to put all $\delta\mathbf{r}_k$ ($k \neq i$) equal to zero, except for one, $\delta\mathbf{r}_i$. Then there
remains

$$\left( m_i \frac{d^2\mathbf{r}_i}{dt^2} + \frac{\partial U}{\partial \mathbf{r}_i} + \alpha \frac{\partial F}{\partial \mathbf{r}_i} \right) \cdot \delta\mathbf{r}_i = 0 \tag{2.6b}$$


Thanks to the parameter $\alpha$, no binding conditions are imposed
upon $\delta\mathbf{r}_i$ anymore, and it can now be treated as completely arbitrary.
But it is then possible to put any two components of vector $\delta\mathbf{r}_i$ equal
to zero, for example $\delta y_i$ and $\delta z_i$, and cancel out the nonzero component
$\delta x_i$, which yields

$$m_i \frac{d^2 x_i}{dt^2} + \frac{\partial U}{\partial x_i} + \alpha \frac{\partial F}{\partial x_i} = 0 \tag{2.7}$$


In the same way we obtain a similar equation for any component.
In vector form the equation is written as follows:

$$m_i \frac{d^2\mathbf{r}_i}{dt^2} + \frac{\partial U}{\partial \mathbf{r}_i} + \alpha \frac{\partial F}{\partial \mathbf{r}_i} = 0 \tag{2.8}$$

the subscript $i$ numbering all the mass points of the mechanical
system. Together with Eq. (2.2), Eqs. (2.8) make possible the determination
of all $\mathbf{r}_i$'s (as functions of time) and the parameter $\alpha$.
Note that from (2.3) the products $-\alpha\,(\partial F/\partial \mathbf{r}_i)$ are in fact the
reactions of the constraints.
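
As an illustration of Eqs. (2.2) and (2.8), consider a single particle held on a circle of radius l in a vertical plane (a plane pendulum). This is a sketch under stated assumptions, not an example from the original text: the constraint is taken as F = x² + y² − l² = 0, the potential as U = mgy, and the function name `circle_constraint_multiplier` is hypothetical. Differentiating the constraint twice with respect to time and substituting the accelerations from Eq. (2.8) gives the multiplier α in closed form.

```python
def circle_constraint_multiplier(m, g, l, x, y, vx, vy):
    """Multiplier alpha and constraint reaction for a particle on the
    circle F(x, y) = x**2 + y**2 - l**2 = 0 with U = m*g*y (assumed setup).
    """
    # Two time derivatives of F = 0 give x*ax + y*ay + vx**2 + vy**2 = 0.
    # Eq. (2.8) gives ax = -2*alpha*x/m and ay = -g - 2*alpha*y/m.
    # Eliminating ax, ay and using x**2 + y**2 = l**2 yields alpha:
    alpha = m * (vx * vx + vy * vy - g * y) / (2.0 * l * l)
    # Reaction of the constraint: -alpha * dF/dr = -alpha * (2x, 2y).
    reaction = (-2.0 * alpha * x, -2.0 * alpha * y)
    return alpha, reaction


# Illustrative values (assumptions): particle at the lowest point.
m, g, l = 1.5, 9.81, 2.0
alpha_rest, reaction_rest = circle_constraint_multiplier(
    m, g, l, 0.0, -l, 0.0, 0.0)
alpha_moving, reaction_moving = circle_constraint_multiplier(
    m, g, l, 0.0, -l, 3.0, 0.0)

# At rest the reaction balances the weight m*g; in motion with speed v
# it also supplies the centripetal term m*v**2/l.
print(reaction_rest, m * g)
print(reaction_moving, m * g + m * 3.0 ** 2 / l)
```

The reaction is directed along the radius, that is, perpendicular to every displacement allowed by the constraint, in agreement with condition (2.4).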

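We shall now formulate Hamilton's principle. For this we transform
the first term in Eq. (2.5) by parts:

$$m_i \frac{d^2\mathbf{r}_i}{dt^2} \cdot \delta\mathbf{r}_i = \frac{d}{dt}\left( m_i \frac{d\mathbf{r}_i}{dt} \cdot \delta\mathbf{r}_i \right) - m_i \frac{d\mathbf{r}_i}{dt} \cdot \frac{d}{dt}\,\delta\mathbf{r}_i$$

(we restrict ourselves for the time being to one term). Note that $\delta\mathbf{r}_i$
denotes the difference between two radius vectors taken at the
same time. The derivative of a difference equals the difference
between the derivatives, so that

$$\frac{d}{dt}\,\delta\mathbf{r}_i = \delta\,\frac{d\mathbf{r}_i}{dt}$$

Taking advantage of the fact that the symbol $\delta$ refers to an
infinitesimal difference, we rewrite the equation as follows:

$$m_i \frac{d^2\mathbf{r}_i}{dt^2} \cdot \delta\mathbf{r}_i = \frac{d}{dt}\left( m_i \frac{d\mathbf{r}_i}{dt} \cdot \delta\mathbf{r}_i \right) - \delta\left[ \frac{m_i}{2} \left( \frac{d\mathbf{r}_i}{dt} \right)^2 \right]$$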

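We sum the obtained expression over the mass points, that is,
over $i$. Thanks to the smallness of $\delta\mathbf{r}_i$, which can be regarded here as
the differential of the coordinate, the sum of the potential terms is

$$\sum_i \frac{\partial U}{\partial \mathbf{r}_i} \cdot \delta\mathbf{r}_i = \delta U$$

Collecting terms in Eq. (2.5), transformed in the way described,
and taking the symbol $\delta$ outside the summation sign, we obtain

$$\sum_i \frac{d}{dt}\left( m_i \frac{d\mathbf{r}_i}{dt} \cdot \delta\mathbf{r}_i \right) - \delta\left[ \sum_i \frac{m_i}{2} \left( \frac{d\mathbf{r}_i}{dt} \right)^2 - U \right] = 0 \tag{2.9}$$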

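We assume now that the system is displaced according to the laws
of mechanics from some given initial position which it occupied at
time $t = t_0$ to another, also given, position at time $t_1$. Since the
two positions are given, all $\delta\mathbf{r}_i$'s must vanish at $t_0$ and $t_1$:

$$(\delta\mathbf{r}_i)_{t=t_0} = (\delta\mathbf{r}_i)_{t=t_1} = 0 \tag{2.10}$$

Integrate the expression on the left-hand side of (2.9) from $t_0$
to $t_1$. The total derivative with respect to time in this case reduces
to the difference between the values of the differentiated quantity
at the limits:

$$\sum_i \left[ \left( m_i \frac{d\mathbf{r}_i}{dt} \cdot \delta\mathbf{r}_i \right)_{t_1} - \left( m_i \frac{d\mathbf{r}_i}{dt} \cdot \delta\mathbf{r}_i \right)_{t_0} \right] - \int_{t_0}^{t_1} \delta\left[ \sum_i \frac{m_i}{2} \left( \frac{d\mathbf{r}_i}{dt} \right)^2 - U \right] dt = 0 \tag{2.11}$$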

But as we have pointed out, at the limits $\delta\mathbf{r}_i$ vanishes. Besides,
the symbol $\delta$, which denotes the difference between function values
at the same instant, can be interchanged with the time integral for
the very same reason that it is interchangeable with the time derivative.
Denoting the integral itself by $S$, we arrive at the following
equation:

$$\delta S = \delta \int_{t_0}^{t_1} \left[ \sum_i \frac{m_i}{2} \left( \frac{d\mathbf{r}_i}{dt} \right)^2 - U \right] dt = 0 \tag{2.12}$$

Since we used Eqs. (1.1) to derive Eq. (2.12), the integral in the
expression for $S$ is taken along the actual path. The symbol $\delta$ in
front of the integral sign indicates that another integral was
simultaneously evaluated along an infinitesimally close path spaced
$\delta\mathbf{r}_i$ apart from the actual path for the $i$th particle. Such a close-lying
path is said to be varied, and the symbol $\delta$ is the variation of the
given quantity.

A variation has an entirely different meaning than a differential. 
The latter refers to the change in a quantity along the path of a
moving system, whereas the former corresponds to a transfer from
one path to another, lying close to the initial one and compatible 
with the constraint imposed upon the system. A differential is 
determined from the equations of motion, a variation is subject 
only to the constraints and is otherwise arbitrary. 

Equation (2.12) shows that the integral S taken along the actual 
path of the system possesses an extremum, since it does not change 
in passing over to a close path. In the same way, near an extremum
an ordinary function does not change its value, to first order, when
its argument is slightly changed.
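
The stationary property of S can be checked numerically on a simple case. The sketch below is an illustration under assumptions that do not come from the text: one particle moving vertically with U = mgx, an actual path x(t) = (g/2) t (T − t) joining x = 0 at t = 0 to x = 0 at t = T, and a varied path obtained by adding ε sin(πt/T), which vanishes at both limits as required by (2.10). The integral in (2.12) is estimated with the midpoint rule.

```python
import math


def action(eps, m=1.0, g=9.81, T=1.0, n=2000):
    """Midpoint-rule estimate of S = integral of (m*v**2/2 - U) dt,
    with U = m*g*x, along the path x(t) + eps*sin(pi*t/T)."""
    dt = T / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        # Actual path plus a variation that vanishes at t = 0 and t = T.
        x = 0.5 * g * t * (T - t) + eps * math.sin(math.pi * t / T)
        # Velocity along the same (varied) path.
        v = (0.5 * g * (T - 2.0 * t)
             + eps * (math.pi / T) * math.cos(math.pi * t / T))
        total += (0.5 * m * v * v - m * g * x) * dt
    return total


s_actual = action(0.0)
for eps in (0.1, 0.01, 0.001):
    # The change in S is the same for +eps and -eps and shrinks like
    # eps**2: there is no term of first order in the variation.
    print(eps, action(eps) - s_actual, action(-eps) - s_actual)
```

For this particular variation the change in S works out analytically to mπ²ε²/(4T), which is of second order in ε; the first-order change vanishes, as Eq. (2.12) asserts.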

Instead of Eqs. (1.1) we can proceed from Eq. (2.12) as a funda¬ 
mental proposition of mechanics. Such an approach may seem con¬ 
trived. Actually, as we shall soon see, it opens the way to very broad 
generalizations. Besides, the equations of motion derived from 
condition (2.12) as a basic principle of mechanics can be much more 
suitable in various applications than the initial set of Eqs. (1.1). 
The quantity S is called the action of a mechanical system, and the
assertion that S has an extreme value along the actual path is known 
as Hamilton's principle. In some cases the principle can be formulated
in simpler terms, when it is known as the principle of least action
(see Exercise at the end of Section 21). 

Degrees of Freedom of a Mechanical System. In order to go over 
from rectangular coordinates to another coordinate system that is 
more convenient for solving certain mechanical problems, we must 
first formulate some essential general definitions. 

Any independent parameter that defines the position of a mechani¬ 
cal system in space is known as its degree of freedom. The number of 
such independent parameters is called the number of degrees of freedom 
of the system. 

The position of a single mass point in space is given by specifying 
three independent parameters (its coordinates) measured relative 
to a certain frame of reference. The position of N material points 
not joined by ideal constraints is defined by 3 N independent para¬ 
meters. 

But if the configuration of the points is in some way secured, the 
number of degrees of freedom may be less than 3 N. For example, 
if two particles are joined by an ideal stationary constraint, their 
six Cartesian coordinates $x_1, y_1, z_1, x_2, y_2, z_2$ are subject to the
condition

$$(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2 = R_{12}^2$$

where $R_{12}$ is the given distance between the points. Consequently
not all the Cartesian coordinates are independent parameters: only 
five of these six quantities are independent. In other words, a system 
of two mass points at a constant distance from one another has five 
degrees of freedom. 

If we consider three mass points rigidly joined by a triangle, the
coordinates of the third point must satisfy two equations analogous
to the one written above, with the quantities $R_{13}^2$ and $R_{23}^2$ in the
right-hand sides. Thus the nine coordinates of the apexes of a rigid 
triangle are subject to three equations, and only six parameters are 
independent. The triangle has six degrees of freedom. 

The position of a solid body in space is fully defined by three 
noncollinear points. Such three points, as was just shown, are given
by six parameters. Hence, an arbitrary solid body has six degrees 
of freedom, provided only motions in which the body does not 
undergo deformation are considered (for instance, the rotation 
of a top). 
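
The counting used in these examples can be written out explicitly. The following is a small sketch with hypothetical function names; it assumes that all constraint equations are independent and, for the rigid body, that each point beyond the third is fixed by three distance conditions to points already placed.

```python
def degrees_of_freedom(n_points, n_constraints):
    """3N Cartesian coordinates minus the number of independent
    constraint equations imposed on them."""
    dof = 3 * n_points - n_constraints
    if dof < 0:
        raise ValueError("more independent constraints than coordinates")
    return dof


def rigid_body_constraints(n_points):
    """Independent distance conditions for n_points >= 3 rigidly joined
    points: 3 for the first triangle, 3 for each further point (assumed)."""
    if n_points < 3:
        raise ValueError("this count applies to three or more points")
    return 3 + 3 * (n_points - 3)


print(degrees_of_freedom(2, 1))    # two points at a fixed distance
print(degrees_of_freedom(3, 3))    # rigid triangle
print(degrees_of_freedom(10, rigid_body_constraints(10)))  # rigid body
```

The count for the rigid body does not depend on the number of points, which is why any undeformed solid has the same number of degrees of freedom as the triangle.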

Generalized Coordinates. The example of points joined by cons¬ 
traints shows that it is not always convenient to describe the posi¬ 
tion of a system in Cartesian coordinates, as this requires the writing 
of supplementary conditions due to the constraints. The choice of 
parameters needed to define the configuration of all the points of 
a mechanical system must be based primarily on considerations of 
expediency. Thus, if the forces depend only on the distances between 
the points, it is reasonable to introduce those distances into the 
dynamical equations explicitly and not in terms of Cartesian coor¬ 
dinates. 

A mechanical system can be described by coordinates whose num¬ 
ber is equal to the number of degrees of freedom of the system. These 
coordinates may sometimes coincide with the Cartesian coordinates
of some of the particles. For example, in a system of two rigidly 
connected points, these coordinates can be chosen in the following 
way: the position of one of the points is given in Cartesian coordi¬ 
nates, after which the other point will always be situated on a sphere 
whose centre is the first point. The position of the second point on the 
sphere may be given by its longitude and latitude. Together with 
the three Cartesian coordinates of the first point, the latitude and 
longitude of the second point completely define the position of the 
system in space. 
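
The choice of coordinates just described can be sketched in code. This is an illustration only: the function name `two_point_positions`, the angle conventions (latitude measured from the equatorial plane, longitude from the x axis) and the numerical values are assumptions, not taken from the text.

```python
import math


def two_point_positions(x1, y1, z1, latitude, longitude, R):
    """Cartesian positions of two points at fixed distance R, given five
    generalized coordinates: x1, y1, z1 of the first point and the
    latitude and longitude of the second on a sphere about the first."""
    p1 = (x1, y1, z1)
    p2 = (x1 + R * math.cos(latitude) * math.cos(longitude),
          y1 + R * math.cos(latitude) * math.sin(longitude),
          z1 + R * math.sin(latitude))
    return p1, p2


# Any values of the five coordinates give an admissible configuration:
p1, p2 = two_point_positions(1.0, -2.0, 0.5, 0.3, 1.2, R=2.0)
separation = math.dist(p1, p2)
print(separation)  # equals R up to rounding, whatever the five inputs
```

The constraint on the distance is satisfied identically, so no supplementary condition has to be written alongside these five coordinates.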

For three rigidly connected points, it is necessary, in accordance 
with the method just described, to specify the position of one side 
of the triangle and the angle of rotation of the third vertex about 
that side. 

The independent parameters which define the position of a me¬ 
chanical system in space are called its generalized coordinates. We will 
represent them by the symbols $q_\alpha$, where the subscript $\alpha$ numbers
the degrees of freedom.

The Lagrange Equations. Since generalized coordinates are in¬ 
dependent, there is no need to impose constraint conditions upon 
them. This is one of their advantages over Cartesian coordinates
in solving dynamical problems. Another advantage appears when
the generalized coordinates correspond to the symmetry properties 
of the system concerned. That is the case of spherical coordinates 
in the motion of a particle in a central force field. We shall now
show how to write the equations of motion in generalized coordinates.

We could go straight over to them in Eqs. (1.1), but that is a
cumbersome and not easily visualized procedure. It is much more
convenient to proceed from Hamilton’s principle (2.12). Since the
generalized coordinates of a system fully define its spatial configuration,
they can be used to express the Cartesian coordinates of
its points. Let the transformation from Cartesian to generalized
coordinates proceed according to the formulas

$$x_i = x_i(\ldots, q_\alpha, \ldots) \tag{2.13}$$

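Differentiating, we obtain the expression for the Cartesian velocity
components in terms of the derivatives $dq_\alpha/dt$, called the generalized
velocities. Instead of $dq_\alpha/dt$ we write, more briefly, $\dot q_\alpha$. We then
obtain

$$\frac{dx_i}{dt} = \sum_\alpha \frac{\partial x_i}{\partial q_\alpha}\,\dot q_\alpha \tag{2.14a}$$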

The summation is over all values of a, that is, over all the degrees 
of freedom of the system. 
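
The sum over $\alpha$ can be made concrete with a short numerical sketch. It is only an illustration under stated assumptions: plane polar coordinates are chosen as the generalized coordinates, the partial derivatives are approximated by central differences, and all function names are hypothetical.

```python
import math

def cartesian(q):
    """Example transformation x_i(q): plane polar (r, phi) -> (x, y)."""
    r, phi = q
    return [r * math.cos(phi), r * math.sin(phi)]

def jacobian(transform, q, h=1e-6):
    """Partial derivatives dx_i/dq_alpha by central differences."""
    n_x = len(transform(q))
    jac = [[0.0] * len(q) for _ in range(n_x)]
    for a in range(len(q)):
        q_plus = list(q)
        q_minus = list(q)
        q_plus[a] += h
        q_minus[a] -= h
        x_plus, x_minus = transform(q_plus), transform(q_minus)
        for i in range(n_x):
            jac[i][a] = (x_plus[i] - x_minus[i]) / (2 * h)
    return jac

def cartesian_velocity(transform, q, q_dot):
    """x_dot_i = sum over alpha of (dx_i/dq_alpha) * q_dot_alpha."""
    jac = jacobian(transform, q)
    return [sum(row[a] * q_dot[a] for a in range(len(q))) for row in jac]

# Hypothetical state: r = 2.0, phi = 0.7, with velocities r_dot, phi_dot.
q, q_dot = [2.0, 0.7], [0.3, -1.1]
v = cartesian_velocity(cartesian, q, q_dot)
print(v)
```

The result can be compared with the hand-differentiated expressions $\dot x = \dot r\cos\varphi - r\sin\varphi\,\dot\varphi$ and $\dot y = \dot r\sin\varphi + r\cos\varphi\,\dot\varphi$.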

It will be readily observed that the index with respect to which 
the summation is performed is involved twice in the right-hand side 
of equation (2.14a): in the partial derivative and in the generalized 
velocity, the two quantities being multiplied. In such cases we shall 
not in future write the summation sign, assuming that the involvement
of an index in a product twice signifies that the summation has
been carried out. Such notation is not only space-saving but, given 
some practice, more easily visualized, since the formulas are not 
cluttered with summation symbols. The index with respect to 
which the summation is performed is called a dummy index. It can 
be redesignated on one side of the equation without touching the 
other. For example 

$$\frac{dx_i}{dt} = \frac{\partial x_i}{\partial q_\alpha}\,\dot q_\alpha = \frac{\partial x_i}{\partial q_\beta}\,\dot q_\beta \tag{2.14b}$$

The point is that both $\alpha$ and $\beta$ assume the same set of values, and
it therefore does not matter what letter we write.

Let us now express $x_i$ and $dx_i/dt$, which are involved in the integrand
of $S$, in terms of the generalized coordinates and velocities
according to formulas (2.13) and (2.14b). This expression (in any
coordinate system) is known as the Lagrange function of a mechanical
system and is denoted $L$. Thus, let there be given the action of a
system, $S$, such that $L = L(\ldots q_\alpha \ldots \dot q_\alpha \ldots)$. Then

$$S = \int_{t_0}^{t_1} L(q_\alpha, \dot q_\alpha)\,dt \tag{2.15}$$


For the actual path of the system it possesses an extremum; this 
property cannot depend on the choice of the coordinate system 
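since it expresses a known physical law. We now vary $S$, but in
generalized, not Cartesian, coordinates. Since the former are independent,
this is also true of their variations. We have

$$\delta S = \int_{t_0}^{t_1} \left( \frac{\partial L}{\partial q_\alpha}\,\delta q_\alpha + \frac{\partial L}{\partial \dot q_\alpha}\,\delta \dot q_\alpha \right) dt = 0 \tag{2.16}$$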


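Making use of the fact that the variation and differentiation symbols
are commutative, we write

$$\frac{\partial L}{\partial \dot q_\alpha}\,\delta \dot q_\alpha = \frac{\partial L}{\partial \dot q_\alpha}\,\frac{d}{dt}\,\delta q_\alpha = \frac{d}{dt} \left( \frac{\partial L}{\partial \dot q_\alpha}\,\delta q_\alpha \right) - \delta q_\alpha\,\frac{d}{dt}\frac{\partial L}{\partial \dot q_\alpha} \tag{2.17}$$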


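We integrate the total derivative with respect to time and substitute
the limits. But at the limits the variations of the coordinates,
as before, vanish, so that the following equation remains:

$$\delta S = \int_{t_0}^{t_1} \left( \frac{\partial L}{\partial q_\alpha} - \frac{d}{dt}\frac{\partial L}{\partial \dot q_\alpha} \right) \delta q_\alpha\,dt = 0 \tag{2.18}$$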


Variations are mutually independent and arbitrary. We first
put all $\delta q_\alpha$’s with the exception of $\delta q_1$ equal to zero. Then Eq. (2.18)
retains only the first term in the summation with respect to $\alpha$:

$$\delta S = \int_{t_0}^{t_1} \left( \frac{\partial L}{\partial q_1} - \frac{d}{dt}\frac{\partial L}{\partial \dot q_1} \right) \delta q_1\,dt = 0 \tag{2.19}$$

We now take advantage of the arbitrariness of the variation $\delta q_1$.
Suppose that the quantity in brackets, by which $\delta q_1$ is multiplied,
in some way changes its sign and absolute value but does not vanish
identically over the integration interval. We now select $\delta q_1$ such that it is
everywhere of the same sign as the expression in brackets. Then the
integrand is positive, so that $\delta S$ cannot vanish. Hence, for Hamilton’s
principle to hold the expression multiplied by $\delta q_1$ must of
necessity vanish. Using the same reasoning for an arbitrary $\delta q_\alpha$,
we find that

$$\frac{d}{dt}\frac{\partial L}{\partial \dot q_\alpha} - \frac{\partial L}{\partial q_\alpha} = 0 \tag{2.20}$$

This is the equation of motion written in terms of generalized
coordinates. The set of Eqs. (2.20) is known as the Lagrange equations
of motion.
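
The vanishing of the first variation can be checked numerically on a simple case. The sketch below is an illustration only, under assumptions not taken from the text: a one-dimensional oscillator with unit mass and unit frequency, $L = \dot x^2/2 - x^2/2$, whose actual path between $t = 0$ and $t = 1$ is taken as $x = \cos t$, and a trial variation $\varepsilon \sin \pi t$ that vanishes at both limits. The action is discretized with a midpoint-type sum.

```python
import math

def action(path, t0=0.0, t1=1.0, steps=1000):
    """Discretized action for L = x_dot**2/2 - x**2/2."""
    dt = (t1 - t0) / steps
    total = 0.0
    for k in range(steps):
        ta, tb = t0 + k * dt, t0 + (k + 1) * dt
        xa, xb = path(ta), path(tb)
        velocity = (xb - xa) / dt
        midpoint = 0.5 * (xa + xb)
        total += (0.5 * velocity ** 2 - 0.5 * midpoint ** 2) * dt
    return total

def varied_path(eps):
    """Actual path cos(t) plus a variation vanishing at t = 0 and t = 1."""
    return lambda t: math.cos(t) + eps * math.sin(math.pi * t)

eps = 0.01
s_true = action(varied_path(0.0))
s_plus = action(varied_path(eps))
s_minus = action(varied_path(-eps))

# Coefficients of eps and eps**2 in S(eps) = S(0) + a*eps + b*eps**2 + ...
first_order = (s_plus - s_minus) / (2 * eps)
second_order = (s_plus + s_minus - 2 * s_true) / (2 * eps ** 2)
print(first_order, second_order)
```

The first-order coefficient comes out close to zero (limited by the discretization), while the second-order coefficient is positive, so on this short interval the action of the actual path is smaller than that of the neighbouring trial paths.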

If the number of degrees of freedom is $n$, then to integrate the
Lagrange equations, which are of the second order with respect to
time, $2n$ initial conditions must be given: $n$ generalized coordinates
and $n$ generalized velocities for time $t_0$. Each generalized coordinate
will then be a function of time, the initial velocities, and the initial
coordinates:

$$q_\alpha = q_\alpha(t;\ q_{01}, \ldots, q_{0n};\ \dot q_{01}, \ldots, \dot q_{0n}) \tag{2.21}$$

Differentiating these equations with respect to time, we obtain
the generalized velocities as functions of the same quantities:

$$\dot q_\alpha = \dot q_\alpha(t;\ q_{01}, \ldots, q_{0n};\ \dot q_{01}, \ldots, \dot q_{0n}) \tag{2.22}$$

Elimination of all the initial values of the coordinates and velocities,
that is, solution of Eqs. (2.21) and (2.22) with respect to the
initial coordinates and velocities, yields $2n$ equations of the form

$$q_{0\alpha}(t;\ q_1, \ldots, q_n;\ \dot q_1, \ldots, \dot q_n) = q_{0\alpha} = \text{constant} \tag{2.23}$$

Functions of the coordinates and velocities of a system that remain
constant throughout the motion are known as the integrals
of the motion. In the right-hand side they can have any constant 
coordinates and velocities, which need not necessarily be the initial 
ones. Determination of the integrals of the motion is one of the 
problems of mechanics. 
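
As an illustration of how the solution is inverted to give integrals of the motion, consider a case that is not worked in the text: a linear oscillator with $L = \dot q^2/2 - \omega^2 q^2/2$, for which the solution and its inverse can be written in closed form. The function names and numerical values below are assumptions made for the sketch.

```python
import math

def trajectory(t, q0, v0, omega=1.0):
    """Coordinate and velocity at time t for given initial values."""
    q = q0 * math.cos(omega * t) + (v0 / omega) * math.sin(omega * t)
    q_dot = -q0 * omega * math.sin(omega * t) + v0 * math.cos(omega * t)
    return q, q_dot

def initial_values(t, q, q_dot, omega=1.0):
    """The same relations solved for the initial coordinate and velocity."""
    q0 = q * math.cos(omega * t) - (q_dot / omega) * math.sin(omega * t)
    v0 = q * omega * math.sin(omega * t) + q_dot * math.cos(omega * t)
    return q0, v0

# Along one actual motion the two functions return the same constants.
for t in (0.0, 0.4, 1.7, 5.0):
    q, q_dot = trajectory(t, q0=0.8, v0=-0.3, omega=2.0)
    print(t, initial_values(t, q, q_dot, omega=2.0))
```

Each printed pair stays at the initial data (0.8, -0.3) up to rounding, although $q$ and $\dot q$ themselves change with time; these two functions of $t$, $q$ and $\dot q$ are integrals of the motion in the sense of (2.23).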


The Determinacy of the Lagrange Function. As is apparent from
its definition, the Lagrange function (or simply Lagrangian) contains
two terms:

$$L = \sum_i \frac{m_i}{2} \left( \frac{dx_i}{dt} \right)^2 - U \tag{2.24a}$$

The first term, which is quadratically dependent upon the velocities,
is called the kinetic energy of the system; the second, which
describes the interaction between particles, is called the potential
energy. The meaning of both will be made clearer in Section 4.

The Lagrange equation (2.20) involves not the function $L$ itself
but its derivatives. This gives rise to the question of the determinacy
of $L$, that is, of possible supplementary terms not affecting the
equations of motion. It is obvious, for example, that in any case
an additional constant term does not affect the equations of motion;
also, a total time derivative of any function of all $q_\alpha$’s and $t$ can
be added to the Lagrangian without in any way affecting the system
of equations (2.20).

This is easily verified by direct substitution as well as in the following
simple way. The term $df(q_\alpha, t)/dt$, which has the form of
a total derivative, can be integrated; as a result it is added to the
action in the form of a difference between the function values at the
limits:

$$\int_{t_0}^{t_1} \left( L + \frac{df}{dt} \right) dt = \int_{t_0}^{t_1} L\,dt + f(q_\alpha, t)\Big|_{t_0}^{t_1} \tag{2.25}$$


But since the variations of the coordinates vanish at the limits, 
these values remain constant in the variation of $q_\alpha$. Hence the
derivative of a function of coordinates and time is not involved in 
the variation of the action and does not affect the equations of 
motion. This property can be used to determine the form of L, if 
it is not given in advance as (2.24a), on the basis of Hamilton’s 
principle and certain other general propositions of mechanics. 
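
The verification by direct substitution can be sketched for a single degree of freedom. The restriction to one coordinate $q$ is an assumption made only to keep the notation short; the general case repeats the same steps for each $q_\alpha$.

```latex
% Added term in the Lagrangian:
\frac{df}{dt} = \frac{\partial f}{\partial q}\,\dot q + \frac{\partial f}{\partial t}

% Its contribution to the first term of the Lagrange equation (2.20):
\frac{d}{dt}\,\frac{\partial}{\partial \dot q}\!\left(\frac{df}{dt}\right)
  = \frac{d}{dt}\,\frac{\partial f}{\partial q}
  = \frac{\partial^2 f}{\partial q^2}\,\dot q + \frac{\partial^2 f}{\partial q\,\partial t}

% Its contribution to the second term:
\frac{\partial}{\partial q}\!\left(\frac{df}{dt}\right)
  = \frac{\partial^2 f}{\partial q^2}\,\dot q + \frac{\partial^2 f}{\partial t\,\partial q}

% The two contributions are identical, so their difference vanishes
% and Eq. (2.20) is unchanged.
```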


The Principle of Relativity. The concept of an inertial frame of 
reference was defined above as a frame in which all the accelerations 
of particles are due solely to interactions between them. Suppose 
we have such a frame. Then all other inertial reference frames must 
be moving uniformly in a straight line relative to it. Otherwise 
bodies moving relative to the initial frame with a velocity constant 
in magnitude and direction would be found to be moving with an 
acceleration relative to another reference frame. But in that case 
the latter would not by definition be inertial. 

Thus, all inertial reference frames are in rectilinear uniform motion 
relative to one another. Any one of them can be legitimately assumed 
at rest, and all the others moving. The equations of motion of a 
mechanical system have the same form in any inertial reference 
frame. A common example is that of a passenger in a train travelling 
at a uniform velocity: he sees all physical phenomena in the coach 
exactly the same as if the train were at rest. It would be better to 
say that this is not an example demonstrating the equivalence of two 
inertial frames of reference but experimental proof of the fundamental 
mechanical principle known as the principle of relativity. As applied
to Newtonian mechanics, which reflects simple facts known to us 
from everyday life, the principle seems self-evident. But when it was 
applied to the theory of electromagnetism, it led to a fundamental 
revision of physical concepts (see Part II).

The Symmetry of the Laws of Motion. The property whereby an
equation expressing a known physical relationship retains its form
in a transformation is known as symmetry with respect to that transformation.
The relativity principle declares that the equations of
motion are symmetrical with respect to the substitution of one
inertial frame for another. Experience shows that the laws of mechanics
possess other forms of symmetry as well.

In a mechanical system that is sufficiently distant from other 
bodies motion is always the same wherever the system is located. 
This means the following. Let there be two identical mechanical 
systems with identical initial conditions of motion, both of which 
are very far away from any other bodies capable of affecting them. 
In that case, if they are taken in the same reference frame, motion in 
them occurs in strictly the same way. In other words, the motion is 
not affected by the transfer of all the moving bodies over the same 
distance, along parallel lines, at the same time. This assertion is, of 
course, based on the vast experience accumulated by mechanics in 
the whole course of its development. More briefly the property is 
known as the homogeneity of space. 

Two equivalent mechanical systems like the ones described here 
can be taken not only displaced relative to one another but also 
turned through any angle. Again, if the two systems are sufficiently 
far away from all bodies capable of affecting them, motion in them 
takes place in the same way. In other words all directions in space 
are equivalent. This property of space is known as the isotropy of
space. Like homogeneity, the isotropy of space also follows from the
sum-total of experience. Homogeneity and isotropy are an expression
of specific properties of the laws of motion: their symmetry with
respect to displacements and rotations. Mathematically, displacements
and rotations in space are represented by corresponding
transformations of the coordinate system.

There is one more type of symmetry of the laws of motion. They 
are homogeneous with respect to time transfer: the laws of motion 
do not change with time. If this property of the laws of motion of 
mechanical systems did not hold, it would be impossible to design 
any machine. 

Determination of the Form of the Lagrangian. The laws of symmetry
of motion listed above, that is, space and time homogeneity,
space isotropy, the relativity principle, and Hamilton’s principle,
can be used to determine the form of the Lagrangian without preliminary
reference to Eqs. (1.1).

Let us start with a free particle sufficiently far away from all
other bodies (which is the definition of a free particle). By virtue
of space homogeneity, its Lagrangian cannot be explicitly dependent
on the coordinates, since otherwise at different spatial points the
particle would move according to different laws. For the same reason
the Lagrangian does not explicitly involve time, and this refers not
only to an individual free particle but to any assembly of particles
not subject to external forces. Thus the Lagrangian of a free particle
can depend only upon its velocity. But $L$ is a scalar quantity. A scalar
can be obtained from a vector in one of two ways: by taking
the absolute value of the vector or by multiplying it scalarly by
another vector. But there is no such preferred vector in isotropic
space since all directions in it are equivalent. Thus the only possible
form of the Lagrangian of a free particle is $L = L(|\dot{\mathbf r}|)$.

It remains to determine what the function is. According to the
relativity principle the character of motion should not change in
passing to another inertial reference frame. As was pointed out, the
latter must be travelling rectilinearly and uniformly relative to the
initial one. If its velocity is $\mathbf V$, the particle under consideration
moves relative to it with the velocity $\dot{\mathbf r} + \mathbf V$. We have made use of
the simple law of velocity composition which, as will be shown in
Part II, holds only when both $|\dot{\mathbf r}|$ and $|\mathbf V|$ are substantially below
the speed of light. Thus in the new inertial frame the Lagrangian is
$L' = L(|\dot{\mathbf r} + \mathbf V|)$. For the law of motion to remain the same the
difference between the two expressions must be equal to the total
derivative of a certain function of the coordinates and time. It is
immediately apparent that for a free particle this leaves only one
possibility:

$$L = \frac{m}{2}\,|\dot{\mathbf r}|^2 \tag{2.26}$$

where $m$ is a constant quantity.

Indeed, we then obtain

$$\frac{m}{2}\,|\dot{\mathbf r} + \mathbf V|^2 - \frac{m}{2}\,|\dot{\mathbf r}|^2 = \frac{d}{dt} \left( m\,\mathbf r \cdot \mathbf V + \frac{m\,|\mathbf V|^2\,t}{2} \right)$$

What is the sign of $m$? Let us determine it. But first we must
somewhat refine Hamilton’s principle by requiring that along short
paths the action be not simply extremal but minimal. Then the
sign of $m$ is positive. At negative $m$ the action could decrease limitlessly
with the increase of $|\dot{\mathbf r}|$. We have thus finally determined the
first term in the Lagrangian for a free particle.

If we now take a system of interacting particles, to describe their
interaction we must introduce an additional term into the Lagrangian.

We assumed that the action of the particles on one another depends
only on their position at a given time. It is, however, significant
that it is determined only by their relative position, that is,
it depends only on the separation $\mathbf r_i - \mathbf r_k$, and not on each vector
separately. Only the differences between vectors remain constant
in transfers of a coordinate system. In addition, only the differences
$\mathbf r_i - \mathbf r_k$ satisfy the relativity principle: the products $\mathbf V t$, which are
added to each radius vector $\mathbf r_i$, $\mathbf r_k$ in going over to another inertial
frame, cancel out.

Since the Lagrangian is a scalar quantity, it can depend either
on the absolute values of the differences $|\mathbf r_i - \mathbf r_k|$ or on scalar
products of the type $(\mathbf r_i - \mathbf r_k) \cdot (\mathbf r_i - \mathbf r_m)$. But the latter case is not
encountered in practice and need not be considered. Hence the
Lagrangian of a system of material points not interacting with
other bodies is

$$L = \sum_i \frac{m_i}{2}\,(\dot{\mathbf r}_i)^2 - U(\ldots |\mathbf r_i - \mathbf r_k| \ldots) \tag{2.24b}$$

We have not restricted ourselves to developing the Lagrange equations
from Eqs. (1.1) and we have performed all of the complex
reasoning needed to deduce formula (2.24b) because in this way
it is easier to arrive at the necessary generalizations required by
Einstein’s relativity principle and electromagnetic field
theory.

The special significance of Hamilton’s principle in mechanics
consists in that it makes it possible to express all symmetry properties
of mechanical systems in the most clear and concise form.
Although they can be derived from the differential equations of
motion as well, the integral principle expresses them much more
distinctly. Since the symmetry of the conditions of motion is a
generalization of certain experimentally established laws, Hamilton’s
principle provides the most convenient means of formulating
all the general laws of mechanics. It should, of course, be borne in 
mind that this formulation is a reflection of a tendency to seek the 
most concise and convenient notation, not of any natural “striving” 
for minimum action. 

Symmetry properties substantially restrict the possible behaviour 
of mechanical systems. As will be shown later in Section 4, different 
types of symmetry are associated with certain quantities (dependent 
upon dynamic variables) whose values, determined at the initial 
instant of time, are conserved. This substantially restricts the 
domain of variables in the problems considered. In a number of 
important cases these quantities are best found with the help of the 
variation principle, according to the symmetry properties inherent 
in it.

Hamilton’s principle, with due account of symmetry requirements,
can be used to determine the form of the Lagrangian, and
thereby the form of the equations of motion. In this sense it possesses
great heuristic force, that is, makes it possible to find unknown
quantities from general considerations.

Finally, the variation principle is extremely convenient in solving
specific problems of mechanics with the help of the Lagrange equations
obtained by variation.


3 


EXAMPLES OF CONSTRUCTING 
THE LAGRANGE EQUATIONS 

The Rules for Constructing the Lagrange Equations. Let us sum up 
the sequence of operations for developing the Lagrange equations for 
a specific mechanical system: 

(i) Express the Cartesian coordinates in terms of the generalized
coordinates:

$$x_i = x_i(q_1, \ldots, q_\alpha, \ldots, q_n)$$

(ii) By differentiating these equalities obtain the Cartesian velocity
components expressed in terms of the generalized coordinates
and generalized velocities (bearing in mind the summation rule for
the subscript $\alpha$):

$$\dot x_i = \frac{\partial x_i}{\partial q_\alpha}\,\dot q_\alpha$$

(iii) Substitute generalized coordinates for the Cartesian coordinates
involved in the potential energy formula:

$$U(\ldots |x_i - x_k| \ldots) = U(q_1, \ldots, q_\alpha, \ldots, q_n)$$

(iv) Substitute the generalized velocities for the velocities involved
in the kinetic energy formula, so that in the most general
case the kinetic energy becomes dependent not only on the generalized
velocities but on the generalized coordinates as well:

$$T = \sum_i \frac{m_i}{2}\,\dot x_i^2 = \sum_i \frac{m_i}{2}\,\frac{\partial x_i}{\partial q_\alpha}\,\frac{\partial x_i}{\partial q_\beta}\,\dot q_\alpha\,\dot q_\beta$$

Since the Cartesian velocity components are homogeneous linear
functions of the generalized velocities $\dot q_\alpha$, the kinetic energy is
a quadratic homogeneous function of $\dot q_\alpha$ and $\dot q_\beta$.

(v) Compute the partial derivatives $\partial L/\partial q_\alpha$ and $\partial L/\partial \dot q_\alpha$, assuming
$q_\alpha$ and $\dot q_\alpha$ to be independent variables.

(vi) Substitute the derivatives for all degrees of freedom into
Eqs. (2.20).

Let us now examine several examples of developing the Lagrange 
equations. 
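
Before turning to those examples, the six rules can be followed through on a short case. This is only a sketch under assumptions that are not in the text: a single particle of mass $m$ moving in a plane, described by polar coordinates $r$, $\varphi$, with a potential energy $U(r)$ depending only on $r$.

```latex
% (i) Cartesian coordinates in terms of the generalized coordinates
x = r\cos\varphi, \qquad y = r\sin\varphi

% (ii) Cartesian velocity components
\dot x = \dot r\cos\varphi - r\sin\varphi\,\dot\varphi, \qquad
\dot y = \dot r\sin\varphi + r\cos\varphi\,\dot\varphi

% (iii) potential energy (assumed to depend on r only)
U = U(r)

% (iv) kinetic energy, quadratic in the generalized velocities
T = \frac{m}{2}\,(\dot x^2 + \dot y^2) = \frac{m}{2}\,(\dot r^2 + r^2\dot\varphi^2)

% (v) partial derivatives of L = T - U
\frac{\partial L}{\partial \dot r} = m\dot r, \qquad
\frac{\partial L}{\partial \dot\varphi} = m r^2\dot\varphi, \qquad
\frac{\partial L}{\partial r} = m r\dot\varphi^2 - \frac{dU}{dr}, \qquad
\frac{\partial L}{\partial \varphi} = 0

% (vi) substitution into Eqs. (2.20)
m\ddot r - m r\dot\varphi^2 + \frac{dU}{dr} = 0, \qquad
\frac{d}{dt}\,(m r^2\dot\varphi) = 0
```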


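Central Forces. This is the name given to forces directed along
lines joining the mass points and dependent only on the distances
between the particles. Actually we already considered such forces
when we assumed that potential energy depends only on the distances
between points. Then the force with which the $k$th point acts
on the $i$th is

$$\mathbf F_{ik} = -\frac{\partial}{\partial \mathbf r_i}\,U(\ldots |\mathbf r_i - \mathbf r_k| \ldots) = -\frac{\partial U}{\partial |\mathbf r_{ik}|}\,\frac{\partial |\mathbf r_{ik}|}{\partial \mathbf r_{ik}} \tag{3.1}$$

The derivative of the scalar quantity $|\mathbf r_i - \mathbf r_k| = |\mathbf r_{ik}|$ with
respect to the vector $\mathbf r_{ik}$ is a vector with components

$$\frac{\partial |\mathbf r_{ik}|}{\partial x_{ik}}, \quad \frac{\partial |\mathbf r_{ik}|}{\partial y_{ik}}, \quad \frac{\partial |\mathbf r_{ik}|}{\partial z_{ik}}$$

For an example let us compute the component along the $x$ axis:

$$\frac{\partial |\mathbf r_{ik}|}{\partial x_{ik}} = \frac{\partial}{\partial x_{ik}} \sqrt{x_{ik}^2 + y_{ik}^2 + z_{ik}^2} = \frac{x_{ik}}{|\mathbf r_{ik}|} \tag{3.2}$$

But the ratio $x_{ik}/|\mathbf r_{ik}|$ is the component along the $x$ axis of a unit
vector directed along the line joining the $i$th and $k$th particles, which implies that
the force represented by expression (3.1) is a central force.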
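
The conclusion can be checked numerically. The sketch below is an illustration only: the pair potential $U = -1/r$, the positions, and the function names are assumptions, and the gradient is approximated by central differences rather than evaluated analytically.

```python
import math

def potential(r):
    """Hypothetical pair potential U(r) = -1/r, for illustration only."""
    return -1.0 / r

def force_on_i(r_i, r_k, h=1e-6):
    """Force on point i: minus the gradient of U with respect to r_i."""
    def u_at(position):
        return potential(math.dist(position, r_k))
    force = []
    for axis in range(3):
        plus = list(r_i)
        minus = list(r_i)
        plus[axis] += h
        minus[axis] -= h
        force.append(-(u_at(plus) - u_at(minus)) / (2 * h))
    return force

r_i, r_k = [1.0, 2.0, -0.5], [-0.5, 0.3, 1.0]
force = force_on_i(r_i, r_k)
r_ik = [a - b for a, b in zip(r_i, r_k)]

# A central force is parallel to r_ik, so the cross product should vanish.
cross = [
    force[1] * r_ik[2] - force[2] * r_ik[1],
    force[2] * r_ik[0] - force[0] * r_ik[2],
    force[0] * r_ik[1] - force[1] * r_ik[0],
]
print(cross)
```

All three components of the cross product vanish to within the accuracy of the finite differences, so the force lies along the line joining the two points, whatever the direction of that line.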

If one of the bodies of a system is much more massive than all the
others (like the sun in our solar system) it can, to a certain approximation,
be assumed at rest, that is, the actions of the other bodies
on it can be neglected. In such cases the body is said to produce
a field in which the mass points are moving.

Suppose we have a gravitational field, so that the central body
attracts the others with a force inversely proportional to the square
of the distance from it. Neglecting the action of the bodies on each
other, it is easy to find the expression for the potential energy of
a particle in the field of the central body. The force with which it
acts on the particle is

$$\mathbf F = -\frac{\alpha}{r^2}\,\frac{\mathbf r}{r} \tag{3.3}$$

where $\alpha$ is a constant factor.

Comparing (3.3) with Eqs. (3.1) and (3.2), we conclude that the
derivative of the potential energy with respect to the distance is

$$\frac{dU}{dr} = \frac{\alpha}{r^2} \tag{3.4}$$

Integrating this equation, we find

$$U = \text{constant} - \frac{\alpha}{r} \tag{3.5}$$

The constant here is the value of the potential energy at an infinite
distance from the attracting body. In the case of forces decreasing
with distance this constant is usually taken to be zero, but this is
a pure formality, since when a force is computed by differentiating
the potential energy the constant is eliminated anyway.



The convention concerning the choice of the constant in the potential
energy expression is called gauging. In this case, at infinity
from the attracting body $U$ is gauged to zero:

$$U = -\frac{\alpha}{r} \tag{3.6}$$

An expression analogous to (3.5) is also obtained for two points 
carrying an electric charge, so that the Coulomb law is, in form, 
like Newton’s law of gravitation. However, since it corresponds to 
two signs of charges it can denote either attraction or repulsion. 
Correspondingly, the potential energy of the Coulomb forces has 
both signs: positive for like charges and negative for opposite. 

Spherical Coordinates. Formula (3.6) suggests that in this instance
it is best to choose $r$ as a generalized coordinate. In other words,
we must transform from Cartesian to spherical coordinates. The
relationship between Cartesian and spherical coordinates is shown
in Figure 1. The $z$ axis is called the polar axis of the spherical coordinate
system. The angle $\vartheta$ between the radius vector and the polar
axis is called the polar angle; it is complementary (to 90°) to the
“latitude”. Finally, the angle $\varphi$ is analogous to the “longitude” and
is called the azimuth. It measures the dihedral angle between the
plane $zOx$ and the plane passing through the polar axis and the
given point $M$.

Let us find the formulas for the transformation from Cartesian
to spherical coordinates. From Figure 1 it is clear that

$$z = r \cos\vartheta \tag{3.7}$$

The projection $\rho$ of the radius vector onto the plane $xOy$ is

$$\rho = r \sin\vartheta \tag{3.8a}$$

Whence

$$x = \rho \cos\varphi = r \sin\vartheta \cos\varphi \tag{3.8b}$$

$$y = \rho \sin\varphi = r \sin\vartheta \sin\varphi \tag{3.8c}$$

We shall now find the expression for the kinetic energy in spherical
coordinates. This can be done either by direct calculation according
to the method indicated at the beginning of this section or by a
geometrical construction. Although the latter is simpler, let us first
follow the computation procedure in order to illustrate the general
method. We have

$$\dot z = \dot r \cos\vartheta - r \sin\vartheta\,\dot\vartheta$$

$$\dot x = \dot r \sin\vartheta \cos\varphi + r \cos\vartheta \cos\varphi\,\dot\vartheta - r \sin\vartheta \sin\varphi\,\dot\varphi$$

$$\dot y = \dot r \sin\vartheta \sin\varphi + r \cos\vartheta \sin\varphi\,\dot\vartheta + r \sin\vartheta \cos\varphi\,\dot\varphi$$

Squaring these equations and adding, we obtain, after very simple
manipulations,

$$T = \frac{m}{2}\,(\dot x^2 + \dot y^2 + \dot z^2) = \frac{m}{2}\,(\dot r^2 + r^2 \dot\vartheta^2 + r^2 \sin^2\vartheta\,\dot\varphi^2) \tag{3.9}$$

The same is clear from the construction shown in Figure 2. An
arbitrary displacement of the point can be resolved into three mutually
perpendicular displacements: $dr$, $r\,d\vartheta$ and $\rho\,d\varphi = r \sin\vartheta\,d\varphi$.
Whence

$$dl^2 = dr^2 + r^2\,d\vartheta^2 + r^2 \sin^2\vartheta\,d\varphi^2 \tag{3.10}$$

Since the square of the velocity is $v^2 = (dl/dt)^2$, (3.9) is obtained
from (3.10) simply by dividing by $(dt)^2$ and multiplying by $m/2$.
Hence, in spherical coordinates, the Lagrangian is expressed as

$$L = \frac{m}{2}\,(\dot r^2 + r^2 \sin^2\vartheta\,\dot\varphi^2 + r^2 \dot\vartheta^2) - U(r) \tag{3.11}$$

Now in order to write down the Lagrange equations it is sufficient
to calculate the partial derivatives:

$$\frac{\partial L}{\partial \dot r} = m \dot r, \quad \frac{\partial L}{\partial \dot\vartheta} = m r^2 \dot\vartheta, \quad \frac{\partial L}{\partial \dot\varphi} = m r^2 \sin^2\vartheta\,\dot\varphi$$

$$\frac{\partial L}{\partial r} = m r \sin^2\vartheta\,\dot\varphi^2 + m r \dot\vartheta^2 - \frac{dU}{dr}, \quad \frac{\partial L}{\partial \varphi} = 0$$

$$\frac{\partial L}{\partial \vartheta} = m r^2 \sin\vartheta \cos\vartheta\,\dot\varphi^2$$

These derivatives must be substituted into (2.20), which, however,
we shall not now do since the motion we are considering actually
reduces to the plane case (see beginning of Section 5).
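
The "very simple manipulations" leading to the kinetic energy (3.9) can be spot-checked numerically. The sketch below is only an illustration: the function names and the sample state are assumptions. It evaluates the Cartesian velocity components from the spherical ones and compares the two forms of the kinetic energy.

```python
import math

def cartesian_velocity(r, theta, phi, r_dot, theta_dot, phi_dot):
    """Time derivatives of x, y, z for the spherical transformation."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    x_dot = r_dot * st * cp + r * ct * cp * theta_dot - r * st * sp * phi_dot
    y_dot = r_dot * st * sp + r * ct * sp * theta_dot + r * st * cp * phi_dot
    z_dot = r_dot * ct - r * st * theta_dot
    return x_dot, y_dot, z_dot

def kinetic_energy_spherical(m, r, theta, r_dot, theta_dot, phi_dot):
    """Kinetic energy written directly in spherical coordinates."""
    return 0.5 * m * (
        r_dot ** 2
        + (r * theta_dot) ** 2
        + (r * math.sin(theta) * phi_dot) ** 2
    )

# Hypothetical mass and state of motion.
m = 2.0
r, theta, phi = 1.3, 0.9, 2.4
r_dot, theta_dot, phi_dot = 0.2, -0.7, 1.1

vx, vy, vz = cartesian_velocity(r, theta, phi, r_dot, theta_dot, phi_dot)
t_cartesian = 0.5 * m * (vx ** 2 + vy ** 2 + vz ** 2)
t_spherical = kinetic_energy_spherical(m, r, theta, r_dot, theta_dot, phi_dot)
print(t_cartesian, t_spherical)  # the two values agree up to rounding
```

Note that the azimuth itself does not enter the spherical form of the kinetic energy, only its rate of change; this is why the derivative of the Lagrangian with respect to the azimuth is zero.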



Two-particle Systems. So far we have considered the centre of 
attraction as stationary, which corresponds to the assumption of an 
infinitely large mass. But it may happen that both masses are similar 
or equal to each other (a binary star, a neutron-proton system, and 
the like). We shall show that the problem of the motion of two bodies 
interacting only with one another can always be easily reduced to 
a problem of the motion of a single body. 

Let the mass of the first particle be $m_1$ and of the second $m_2$. We
call the radius vectors of these particles, drawn from an arbitrary
origin, $\mathbf r_1$ and $\mathbf r_2$, respectively. The components of $\mathbf r_1$ are $x_1$, $y_1$, $z_1$;
the components of $\mathbf r_2$ are $x_2$, $y_2$, $z_2$. We now define the radius vector
of the centre of mass of these particles, $\mathbf R$, by the following formula:

$$\mathbf R = \frac{m_1 \mathbf r_1 + m_2 \mathbf r_2}{m_1 + m_2} \tag{3.12}$$

Sometimes the terms “centre of inertia” or “centre of gravity” are
used. However, the centre of gravity can be determined only in
a uniform field of gravitational forces.

In addition, let us introduce the radius vector of the relative
position of the particles

$$\mathbf r = \mathbf r_1 - \mathbf r_2 \tag{3.13}$$

Let us express their kinetic energy in terms of $\dot{\mathbf R}$ and $\dot{\mathbf r}$. Solving
(3.12) and (3.13) for $\mathbf r_1$ and $\mathbf r_2$, and differentiating them with
respect to time, we obtain

$$\dot{\mathbf r}_1 = \dot{\mathbf R} + \frac{m_2}{m_1 + m_2}\,\dot{\mathbf r}, \qquad \dot{\mathbf r}_2 = \dot{\mathbf R} - \frac{m_1}{m_1 + m_2}\,\dot{\mathbf r} \tag{3.14}$$

The kinetic energy is equal to

$$T = \frac{m_1}{2}\,\dot{\mathbf r}_1^2 + \frac{m_2}{2}\,\dot{\mathbf r}_2^2 \tag{3.15}$$

Substituting (3.14) into it, we obtain, after a simple rearrangement,

$$T = \frac{m_1 + m_2}{2}\,\dot{\mathbf R}^2 + \frac{m_1 m_2}{2\,(m_1 + m_2)}\,\dot{\mathbf r}^2 \tag{3.16}$$

The cross term involving $\dot{\mathbf R} \cdot \dot{\mathbf r}$ has been eliminated, which is the purpose
of the transformation.

Since by definition there are no external forces acting on the mass
points, the potential energy can depend only on the distance between
them, $r$: $U = U(r)$. Thus the Lagrangian is

$$L = \frac{m_1 + m_2}{2}\,\dot{\mathbf R}^2 + \frac{m_1 m_2}{2\,(m_1 + m_2)}\,\dot{\mathbf r}^2 - U(r) \tag{3.17}$$
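
The separation of the kinetic energy into a centre-of-mass part and a relative part can be confirmed on arbitrary numbers. The following sketch is only an illustration; the function name and the sample masses and velocities are assumptions.

```python
def split_kinetic_energy(m1, m2, v1, v2):
    """Centre-of-mass and relative parts of the two-particle kinetic energy."""
    total_mass = m1 + m2
    reduced = m1 * m2 / total_mass
    # Velocity of the centre of mass and relative velocity v1 - v2.
    v_cm = [(m1 * a + m2 * b) / total_mass for a, b in zip(v1, v2)]
    v_rel = [a - b for a, b in zip(v1, v2)]
    t_cm = 0.5 * total_mass * sum(c * c for c in v_cm)
    t_rel = 0.5 * reduced * sum(c * c for c in v_rel)
    return t_cm, t_rel

# Hypothetical masses and velocity vectors.
m1, m2 = 3.0, 5.0
v1, v2 = [1.0, -2.0, 0.5], [-0.3, 0.4, 2.0]

t_cm, t_rel = split_kinetic_energy(m1, m2, v1, v2)
t_direct = 0.5 * m1 * sum(c * c for c in v1) + 0.5 * m2 * sum(c * c for c in v2)
print(t_cm + t_rel, t_direct)  # equal up to rounding; no cross term is needed
```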

Let us now write the Lagrange equations for the centre-of-mass
coordinates. Differentiation of Eq. (3.17) yields

$$\frac{\partial L}{\partial \dot{\mathbf R}} = (m_1 + m_2)\,\dot{\mathbf R}, \qquad \frac{\partial L}{\partial \mathbf R} = 0$$

Hence, in accordance with (2.20), we have $\ddot{\mathbf R} = 0$, or in Cartesian
coordinates

$$\ddot X = 0, \quad \ddot Y = 0, \quad \ddot Z = 0$$

These equations can be easily integrated. Whence

$$X = \dot X_0 t + X_0, \quad Y = \dot Y_0 t + Y_0, \quad Z = \dot Z_0 t + Z_0$$

where the subscript 0 corresponds to the values of the quantities
at time zero. Combining the coordinate equations into one vector
equation, we obtain

$$\mathbf R = \dot{\mathbf R}_0 t + \mathbf R_0 \tag{3.18}$$

Thus, the centre of mass moves uniformly in a straight line relative
to the initial reference frame. But we assumed the frame to be inertial
because the forces in it are due only to interactions between the
mass points [the corresponding potential energy is $U(r)$]. Hence,
a reference frame fixed with respect to the centre of mass of two mass
points is also inertial. If we pass to it, there remains the relative
motion of both mass points, which is described by the separation $\mathbf r$.
In the centre-of-mass reference frame the Lagrangian has the form

$$L = \frac{m_1 m_2}{2\,(m_1 + m_2)}\,\dot{\mathbf r}^2 - U(r) \tag{3.19}$$

Here, obviously, $\dot{\mathbf r}^2 = \dot x^2 + \dot y^2 + \dot z^2$. This Lagrangian involves only
three coordinates, not six as in (3.17). Consequently the problem
of the motion of two bodies of masses $m_1$ and $m_2$ reduces to the problem
of the motion of one body of mass

$$m = \frac{m_1 m_2}{m_1 + m_2} \tag{3.20}$$

which is called the reduced mass.

Since the reference frame connected with the centre of mass is 
inertial, the motion of the centre of mass does not affect the relative 
motion of the mass points. In the next section we shall show that 
this assertion holds for any number of mass points not subject to 
external forces. It can simply be assumed that the centre of mass 
is at rest at the origin of the coordinate system, R = 0. 

If the relative motion of two mass points is described in spherical 
coordinates, the equations of motion have the same form as for 
one point moving relative to a fixed centre of attraction. 

Assuming the centre of mass of the two points to be at rest at the
origin of the coordinate system, we find the distances of both points
from the origin:

$$r_1 = \frac{m_2\,r}{m_1 + m_2}, \qquad r_2 = \frac{m_1\,r}{m_1 + m_2}$$

Thus, if one of the masses is much smaller than the other ($m_2 \ll m_1$),
then $r_1 \ll r_2$, that is, the centre of mass of the two points lies very
close to the point of greater mass. This is the case for a system consisting
of the earth and an artificial satellite and, to a poorer approximation,
for the earth-moon system. The reduced mass can be
written thus:

$$m = \frac{m_2}{1 + m_2/m_1} \tag{3.21}$$

From this it can be seen that it equals approximately the smaller
mass. That is why the motion of a satellite can be described as if
the earth were fixed and the mass of the satellite were independent
of the mass of the earth.
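
The quality of this approximation in the two cases mentioned can be gauged with rough figures. The sketch below is an illustration only: the masses of the earth and the moon are approximate reference values, the satellite mass is a hypothetical round number, and none of them come from the text.

```python
def reduced_mass(m1, m2):
    """Reduced mass of two bodies with masses m1 and m2."""
    return m1 * m2 / (m1 + m2)

EARTH_MASS = 5.972e24      # kg, approximate
MOON_MASS = 7.342e22       # kg, approximate
SATELLITE_MASS = 1.0e3     # kg, hypothetical artificial satellite

for name, m2 in (("satellite", SATELLITE_MASS), ("moon", MOON_MASS)):
    ratio = reduced_mass(EARTH_MASS, m2) / m2
    print(name, ratio)  # ratio of the reduced mass to the smaller mass
```

For the satellite the ratio is indistinguishable from unity in double precision, whereas for the moon the reduced mass falls short of the moon's own mass by roughly one percent, which is why the fixed-earth picture is a poorer approximation there.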


Simple and Double Pendulums. In concluding this section we shall
derive the Lagrangian for simple and double pendulums. We shall 
make use of them later. 

The simple plane pendulum is a mass point $m$ suspended on a flat
hinge at a certain point by a weightless rod of length $l$. The hinge
restricts the plane of oscillation of the pendulum (Figure 3). It is
clear that such a pendulum has one degree of freedom. We take
the angle of deflection of the pendulum from the vertical, $\varphi$, as
a generalized coordinate. Obviously the velocity of the mass point
is equal to $l\dot\varphi$, so that the kinetic energy is

$$T = \frac{m}{2}\,l^2 \dot\varphi^2$$

Acting on the pendulum is the force of gravity $-mg$. Here, $g$ is
the acceleration of free fall; the minus sign takes account of the
fact that the force is directed downwards. Hence the potential
energy $U = mgz$, and $F = -dU/dz$, where $z$ is expressed in terms
of the angle as follows: $z = l\,(1 - \cos\varphi)$. Thus the Lagrangian of
the pendulum is

$$L = \frac{m}{2}\,l^2 \dot\varphi^2 - mgl\,(1 - \cos\varphi) \tag{3.22}$$

Note that mass m is involved as a common multiplier of both terms 
in expression (3.22) and therefore cancels out in the equations of 
motion. It follows that the law of oscillation of a pendulum does not 
depend on its mass. This, of course, is true if all types of friction can 
be neglected. 
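
The cancellation of the mass can be seen by writing out the Lagrange equation for the pendulum Lagrangian $L = (m/2)\,l^2\dot\varphi^2 - mgl\,(1 - \cos\varphi)$. The short derivation below is a sketch added for illustration; it uses only this Lagrangian and Eq. (2.20).

```latex
% Partial derivatives of the pendulum Lagrangian
\frac{\partial L}{\partial \dot\varphi} = m l^2 \dot\varphi, \qquad
\frac{\partial L}{\partial \varphi} = -m g l \sin\varphi

% Substitution into Eq. (2.20)
m l^2 \ddot\varphi + m g l \sin\varphi = 0

% Dividing by m l^2, the mass drops out
\ddot\varphi + \frac{g}{l}\,\sin\varphi = 0
```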

A double pendulum can be described in the following way: at
point $m$ there is another hinge from which another pendulum, which
is forced to oscillate in the same plane (Figure 4), is suspended.
Let the mass and length of the second pendulum be $m_1$ and $l_1$, respectively,
and its angle of deflection from the vertical, $\psi$. The coordinates
of the second mass point are

$$x_1 = l \sin\varphi + l_1 \sin\psi$$

$$z_1 = l\,(1 - \cos\varphi) + l_1\,(1 - \cos\psi)$$

Whence we obtain its velocity components:

$$\dot x_1 = l \cos\varphi\,\dot\varphi + l_1 \cos\psi\,\dot\psi$$

$$\dot z_1 = l \sin\varphi\,\dot\varphi + l_1 \sin\psi\,\dot\psi$$

Squaring and adding them, we express the kinetic energy of the
second particle in terms of the generalized coordinates $\varphi$, $\psi$ and the
generalized velocities $\dot\varphi$, $\dot\psi$:

$$T_1 = \frac{m_1}{2} \left[ l^2 \dot\varphi^2 + l_1^2 \dot\psi^2 + 2 l l_1 \cos(\varphi - \psi)\,\dot\varphi \dot\psi \right]$$

The potential energy of the second particle is determined in terms
of $z_1$. Finally, we get an expression of the Lagrangian for a double
pendulum in the following form:

$$L = \frac{m + m_1}{2}\,l^2 \dot\varphi^2 + \frac{m_1}{2}\,l_1^2 \dot\psi^2 + m_1 l l_1 \cos(\varphi - \psi)\,\dot\varphi \dot\psi - (m + m_1)\,g l\,(1 - \cos\varphi) - m_1 g l_1\,(1 - \cos\psi) \tag{3.23}$$

By using generalized coordinates we completely avoided the question
of the forces of reaction appearing in the hinges.
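
The passage from Cartesian velocity components to the kinetic energy of the second mass point can be checked numerically. The sketch below is only an illustration: the function names and the sample values are assumptions, not part of the text. It computes the kinetic energy once from the velocity components and once from the closed form in generalized coordinates, and also evaluates the Lagrangian (3.23).

```python
import math

def second_bob_velocity(l, l1, phi, psi, phi_dot, psi_dot):
    """Cartesian velocity components of the second mass point."""
    x1_dot = l * math.cos(phi) * phi_dot + l1 * math.cos(psi) * psi_dot
    z1_dot = l * math.sin(phi) * phi_dot + l1 * math.sin(psi) * psi_dot
    return x1_dot, z1_dot

def second_bob_kinetic_energy(m1, l, l1, phi, psi, phi_dot, psi_dot):
    """Kinetic energy T_1 written in generalized coordinates and velocities."""
    return 0.5 * m1 * (l**2 * phi_dot**2 + l1**2 * psi_dot**2
                       + 2.0 * l * l1 * math.cos(phi - psi) * phi_dot * psi_dot)

def double_pendulum_lagrangian(m, m1, l, l1, g, phi, psi, phi_dot, psi_dot):
    """Lagrangian (3.23): total kinetic energy minus total potential energy."""
    kinetic = (0.5 * m * l**2 * phi_dot**2
               + second_bob_kinetic_energy(m1, l, l1, phi, psi, phi_dot, psi_dot))
    potential = ((m + m1) * g * l * (1.0 - math.cos(phi))
                 + m1 * g * l1 * (1.0 - math.cos(psi)))
    return kinetic - potential
```

The cross term with $\cos(\varphi - \psi)$ arises from $\cos\varphi\cos\psi + \sin\varphi\sin\psi$ when the two velocity components are squared and added.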


EXERCISES 

1. Write the Lagrange equation if the Lagrangian has the form

$$L = -(1 - \dot q^2)^{1/2}$$

We shall encounter Lagrangians of a similar type in Part II. 

2. A point moves in a vertical plane along a given curve in a
gravitational field. The equation of the curve in parametric form is
$x = x(s)$, $z = z(s)$. Write the Lagrange equations.

Solution. The velocities are

$$\dot x = \frac{dx}{ds}\,\dot s = x'\dot s, \qquad \dot z = \frac{dz}{ds}\,\dot s = z'\dot s$$

The Lagrangian has the form

$$L = \frac{m}{2}\,(x'^2 + z'^2)\,\dot s^2 - mgz(s)$$

The Lagrange equation is

$$m\,\frac{d}{dt}\left[(x'^2 + z'^2)\,\dot s\right] - m \dot s^2\,(x'x'' + z'z'') + mgz' = 0$$

3. Write the Lagrange equations for an elastically suspended pendulum.
The potential energy of the elastic force in the case of such a pendulum
is calculated from the formula $U = (k/2)(l - l_0)^2$, where $l_0$ is the
equilibrium length of the unextended rod, and $k$ is a constant characterizing its
elasticity. Use $l$ and $\varphi$ as the generalized coordinates.

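4. Write the kinetic energy of a system of three mass points $m_1$, $m_2$, $m_3$
in terms of the kinetic energy of motion of the three points moving together
with the centre of mass and the kinetic energy of relative motion.

Answer.

$$T = \frac{M}{2}\,\dot{\mathbf R}^2 + \frac{m_2 (m_1 + m_3)}{2M}\,\dot{\boldsymbol\rho}_2^2 + \frac{m_3 (m_1 + m_2)}{2M}\,\dot{\boldsymbol\rho}_3^2 - \frac{m_2 m_3}{M}\,\dot{\boldsymbol\rho}_2 \cdot \dot{\boldsymbol\rho}_3$$

where

$$M = m_1 + m_2 + m_3, \qquad \boldsymbol\rho_2 = \mathbf r_1 - \mathbf r_2, \qquad \boldsymbol\rho_3 = \mathbf r_1 - \mathbf r_3$$

$$\mathbf R = \frac{m_1 \mathbf r_1 + m_2 \mathbf r_2 + m_3 \mathbf r_3}{M}$$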
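
The answer to Exercise 4 can be checked numerically. The following is a minimal sketch with arbitrary sample data; the helper names are assumptions made for the illustration. It compares the kinetic energy summed directly over the three points with the centre-of-mass and relative-motion form.

```python
def dot(a, b):
    """Scalar product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def kinetic_energy_direct(masses, velocities):
    """Sum of m_i v_i^2 / 2 over the three points."""
    return 0.5 * sum(m * dot(v, v) for m, v in zip(masses, velocities))

def kinetic_energy_split(masses, velocities):
    """Centre-of-mass term plus relative terms in rho_2 = r_1 - r_2, rho_3 = r_1 - r_3."""
    m1, m2, m3 = masses
    v1, v2, v3 = velocities
    M = m1 + m2 + m3
    R_dot = [(m1 * a + m2 * b + m3 * c) / M for a, b, c in zip(v1, v2, v3)]
    rho2_dot = [a - b for a, b in zip(v1, v2)]
    rho3_dot = [a - c for a, c in zip(v1, v3)]
    return (0.5 * M * dot(R_dot, R_dot)
            + m2 * (m1 + m3) / (2.0 * M) * dot(rho2_dot, rho2_dot)
            + m3 * (m1 + m2) / (2.0 * M) * dot(rho3_dot, rho3_dot)
            - m2 * m3 / M * dot(rho2_dot, rho3_dot))
```

Both functions should agree to rounding error for any choice of masses and velocities.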


4 


CONSERVATION LAWS 

In Section 2 we gave a general definition of the integrals of the
motion. Finding all the integrals of the motion of an arbitrary
mechanical system is extremely difficult and rarely accomplished in
analytical form. However, there are certain important integrals of
the motion which can be written directly according to the form of
the Lagrangian. These integrals will be examined in this section.

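Energy. Let us use the Lagrangian to determine a quantity

$$E = \dot q_\alpha \frac{\partial L}{\partial \dot q_\alpha} - L \qquad (4.1)$$

(the summation over $\alpha$ is from 1 to $n$). The quantity $E$ is called the
total energy of a system. Let us calculate its total derivative with
respect to time:

$$\frac{dE}{dt} = \ddot q_\alpha \frac{\partial L}{\partial \dot q_\alpha} + \dot q_\alpha \frac{d}{dt}\frac{\partial L}{\partial \dot q_\alpha} - \frac{\partial L}{\partial t} - \frac{\partial L}{\partial q_\alpha}\,\dot q_\alpha - \frac{\partial L}{\partial \dot q_\alpha}\,\ddot q_\alpha$$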

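The last three terms in the right-hand side are, with the opposite sign,
the total derivative of the Lagrangian, $dL/dt$, which is dependent on
$q_\alpha$, $\dot q_\alpha$, and in some cases on time. From the Lagrange equations,
the quantity $(d/dt)(\partial L/\partial \dot q_\alpha)$ can be expressed as
$\partial L/\partial q_\alpha$. Thus

$$\frac{dE}{dt} = \ddot q_\alpha \frac{\partial L}{\partial \dot q_\alpha} + \dot q_\alpha \frac{\partial L}{\partial q_\alpha} - \frac{\partial L}{\partial t} - \frac{\partial L}{\partial q_\alpha}\,\dot q_\alpha - \frac{\partial L}{\partial \dot q_\alpha}\,\ddot q_\alpha = -\frac{\partial L}{\partial t} \qquad (4.2)$$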


Hence, if the Lagrangian is not explicitly dependent on time, the 
energy E is an integral of the motion, that is, it is conserved. 

Let us now consider in what cases $L$ does not depend on $t$. It was
pointed out in Section 2 that if a given mechanical system is
sufficiently far away from all other bodies, or closed, as they say, then
time is not involved in the Lagrangian. This expresses the homogeneity
of time. That means that in a closed system energy is conserved,
or as we have agreed to say, it is an integral of the motion. If there
are two noninteracting closed systems, the Lagrangian for them 
comprises the sum of the Lagrangians of each system separately. 
Accordingly, the total energy is, by definition (4.1), expressed as 
the sum of the energies of both systems. The energy of such systems 
is an additive integral of the motion. 

Energy is conserved not only in isolated, closed systems. If there 
is a uniform external force field acting on a system, the Lagrangian 
of such a system does not involve time explicitly, so that, from (4.2), 
the energy is also an integral of the motion. When constraints are 
imposed upon a system, its Cartesian coordinates are expressed in 
terms of the generalized coordinates according to formulas that do 
not explicitly involve time. This case was examined in Section 2. 
The energy of the system here is also conserved. 

Ideal constraints can be treated as a special case of a field of force,
so that the action of constraints not involving time is analogous to
a uniform field. Therefore energy is in this case conserved. But in
a variable external field or with constraints explicitly involving time 
the energy of a system is not conserved: either work is done on the 
system or the system itself does work on some external object. 

When forces of friction act within a closed system, the energy of 
macroscopic motion transforms into the energy of microscopic, 
molecular, motion. Together with this internal energy, the energy 
of a closed system is, of course, conserved, but the Lagrangian, 
which involves only the generalized coordinates of motion of the 
system as a whole, no longer provides a complete description of the 
system’s motion. The mechanical energy of the macroscopic motion
alone, determined with the help of such a Lagrangian, is no longer
conserved. Mechanical energy is transformed into the energy of 
internal (microscopic molecular) motion in friction and impact. 

Let us now show the form to which the total energy reduces when
the Lagrangian can be represented in the form $L = T - U$, where
the kinetic energy $T$ is a homogeneous quadratic function of the
generalized velocities, $T = \frac{1}{2}\, T_{\alpha\beta}\, \dot q_\alpha \dot q_\beta$. Since we have assumed
that the potential energy depends only on the coordinates, for the
derivatives with respect to the generalized velocities we have

$$\frac{\partial L}{\partial \dot q_\alpha} = \frac{\partial T}{\partial \dot q_\alpha}$$

so that the total energy is

$$E = \dot q_\alpha \frac{\partial T}{\partial \dot q_\alpha} - L \qquad (4.3)$$

But according to Euler’s theorem on homogeneous functions the 
sum of the partial derivatives multiplied by the corresponding 
variables is equal to the function itself multiplied by the degree 
of homogeneity (this is easily verified from the function of two 
variables ax 2 + 2 bxy + cy 2 ). Hence 

E = 2T — (T — U) = T + U (4.4) 

that is, the total energy is equal to the sum of the potential and 
kinetic energies, in agreement with the conventional definition. 
This explains the names given to the functions T and U. 
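
The verification of Euler's theorem mentioned above takes one line for the two-variable quadratic form $f = ax^2 + 2bxy + cy^2$, whose degree of homogeneity is 2:

$$x\frac{\partial f}{\partial x} + y\frac{\partial f}{\partial y} = x\,(2ax + 2by) + y\,(2bx + 2cy) = 2\,(ax^2 + 2bxy + cy^2) = 2f$$

Applied to the kinetic energy, with the generalized velocities in place of $x$ and $y$, the same identity reads $\dot q_\alpha\, \partial T/\partial \dot q_\alpha = 2T$, which is the step that leads from (4.3) to (4.4).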

Note that the definition of energy (4.1) is more general and can 
be used in the case when the Lagrangian cannot be represented as 
L = T — U (see Part II). 

Application of the Energy Integral to Systems with One Degree 
of Freedom. The energy integral allows us, straightaway, to reduce 
problems of the motion of systems with one degree of freedom to 
those of quadrature. Thus, in the pendulum problem considered in 
the previous section we can, with the aid of (4.4), write the energy 
integral directly: 

$$E = \frac{m}{2}\, l^2 \dot\varphi^2 + mgl\,(1 - \cos\varphi) \qquad (4.5)$$

The value of $E$ is determined from the initial conditions. For
example, let the pendulum initially be deflected at an angle $\varphi_0$
and released without any initial speed, so that $\dot\varphi_0 = 0$. Hence

$$E = mgl\,(1 - \cos\varphi_0)$$

Substituting this into (4.5), we have

$$mgl\,(\cos\varphi - \cos\varphi_0) = \frac{m}{2}\, l^2 \dot\varphi^2 \qquad (4.6)$$

From this, the relationship between the angle of deflection and
time is determined by the quadrature

$$t = -\left(\frac{l}{2g}\right)^{1/2} \int_{\varphi_0}^{\varphi} \frac{d\varphi}{(\cos\varphi - \cos\varphi_0)^{1/2}} \qquad (4.7)$$

The minus sign has been taken because at the beginning of the
motion the angle $\varphi$ decreases. The integral involved in (4.7) cannot be
expressed in terms of elementary functions.

It is significant that the oscillation law of a pendulum depends
only on the ratio $l/g$. The mass, as pointed out in the preceding
section, is eliminated. Thus, a pendulum can be used to measure
the acceleration of free fall, $g$, to a high degree of accuracy.

A system in which mechanical energy is conserved is sometimes
called a conservative system. Thus, the energy integral makes it
possible to reduce the problem of the motion of a conservative system
with one degree of freedom to quadratures. The fact that the quadrature
need not necessarily be expressed in terms of elementary functions,
as is the case in (4.7), is of no consequence.
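
A quadrature such as (4.7) is easily evaluated numerically. The sketch below is an illustration under stated assumptions rather than part of the text: it uses the standard substitution $\sin(\varphi/2) = \sin(\varphi_0/2)\sin\theta$, which turns the quarter-period integral into the complete elliptic integral $K(k)$ with $k = \sin(\varphi_0/2)$, so that the full period is $4\,(l/g)^{1/2} K(k)$. The function names and the midpoint rule are choices made here.

```python
import math

def complete_elliptic_k(k, n=2000):
    """K(k) = integral from 0 to pi/2 of d(theta) / sqrt(1 - k^2 sin^2(theta)),
    evaluated with the midpoint rule on n equal subintervals."""
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        total += 1.0 / math.sqrt(1.0 - (k * math.sin(theta)) ** 2)
    return total * h

def pendulum_period(length, g, phi0):
    """Full period of a pendulum released from rest at the angle phi0 (radians)."""
    k = math.sin(phi0 / 2.0)
    return 4.0 * math.sqrt(length / g) * complete_elliptic_k(k)
```

For small $\varphi_0$ the result tends to $2\pi (l/g)^{1/2}$, and it grows with the amplitude. The mass never enters, and only the ratio $l/g$ does.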

In a conservative system with several degrees of freedom the 
energy integral allows us to reduce the order of the set of differential 
equations by one unit and thereby simplify the integration. 


Generalized Momentum. We shall now consider other integrals of
motion which can be found directly with the aid of the Lagrangian.
To do this we shall take advantage of the following, quite obvious,
consequence of the Lagrange equations. If some coordinate does not
appear explicitly in the Lagrangian, $\partial L/\partial q_\alpha = 0$, then in accordance
with the Lagrange equations

$$\frac{d}{dt}\frac{\partial L}{\partial \dot q_\alpha} = 0 \qquad (4.8)$$

But then

$$p_\alpha = \frac{\partial L}{\partial \dot q_\alpha} \qquad (4.9)$$

is constant, which implies that it is an integral of the motion. The
quantity $p_\alpha$ is called the generalized momentum corresponding to
the coordinate with index $\alpha$. This definition includes the momentum
in the usual sense

$$p_x = \frac{\partial L}{\partial v_x} = \frac{\partial L}{\partial \dot x} = m v_x \qquad (4.10)$$
Summarizing, if a certain generalized coordinate does not appear
explicitly in the Lagrangian, the generalized momentum corresponding
to it is an integral of the motion, that is, it remains constant
during the motion. The coordinate itself is said to be cyclic.

In the preceding section we saw that the coordinates $X$, $Y$, $Z$ of
the centre of mass of a system of two particles not subject to external
forces do not appear in the Lagrangian. From this it is evident that

$$(m_1 + m_2)\,\dot X = P_x$$

$$(m_1 + m_2)\,\dot Y = P_y$$

$$(m_1 + m_2)\,\dot Z = P_z \qquad (4.11)$$

are integrals of the motion.

Momentum of a System of Particles. The same is readily shown
also for a system of $N$ particles. Indeed, for $N$ particles we can
introduce the concept of centre of mass by means of the equation

$$\mathbf R = \frac{\sum_i m_i \mathbf r_i}{\sum_i m_i} \qquad (4.12)$$

and of the velocity of the centre of mass as

$$\dot{\mathbf R} = \frac{\sum_i m_i \dot{\mathbf r}_i}{\sum_i m_i} \qquad (4.13)$$

The velocity of the $i$th point relative to the centre of mass is (by
the theorem of the composition of velocities)

$$\dot{\mathbf r}'_i = \dot{\mathbf r}_i - \dot{\mathbf R} \qquad (4.14)$$

The kinetic energy of the system of particles is

$$T = \frac{1}{2}\sum_{i=1}^{N} m_i \dot{\mathbf r}_i^2 = \frac{1}{2}\sum_{i=1}^{N} m_i\,(\dot{\mathbf r}'_i + \dot{\mathbf R})^2$$

$$= \frac{1}{2}\,\dot{\mathbf R}^2 \sum_{i=1}^{N} m_i + \dot{\mathbf R} \cdot \sum_{i=1}^{N} m_i \dot{\mathbf r}'_i + \frac{1}{2}\sum_{i=1}^{N} m_i \dot{\mathbf r}'^{\,2}_i \qquad (4.15)$$

But from (4.13) and (4.14) it is immediately apparent that, from
the definition of $\mathbf r'_i$ and $\mathbf R$, $\sum_{i=1}^{N} m_i \dot{\mathbf r}'_i = 0$. Therefore the kinetic energy
of a system of particles separates into a sum of two terms: the kinetic
energy of motion of the total mass with the velocity of the centre
of mass relative to the adopted reference frame, and the kinetic
energy of motion of the masses relative to the centre of mass:

$$T = T_{\text{cm}} + T_{\text{rel}} = \frac{1}{2}\,\dot{\mathbf R}^2 \sum_{i=1}^{N} m_i + \frac{1}{2}\sum_{i=1}^{N} m_i \dot{\mathbf r}'^{\,2}_i \qquad (4.16)$$

The vectors $\mathbf r'_i$ are not independent: as mentioned, they are
governed by the vector equation $\sum_i m_i \mathbf r'_i = 0$. Consequently they can be
expressed in terms of $N - 1$ independent quantities by defining
the relative positions of the $i$th point and some fixed point, for
instance the first. The kinetic energy of relative motion is expressed
in terms of the relative velocities of the particles (see Exercise 4,
Section 3). In a closed system, by virtue of its fundamental property,
there are no external forces acting on the particles, while the forces
of interaction within the system depend only on the relative
configuration of the particles, that is, on $\mathbf r_i - \mathbf r_k$.
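
The separation of the kinetic energy into a centre-of-mass part and a relative part can be checked on arbitrary data. The sketch below is only an illustration, and its function names are assumptions. It also returns the relative velocities, so that the condition $\sum_i m_i \dot{\mathbf r}'_i = 0$ can be verified.

```python
def dot(a, b):
    """Scalar product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def centre_of_mass_velocity(masses, velocities):
    """Mass-weighted mean velocity, as in (4.13)."""
    total_mass = sum(masses)
    return [sum(m * v[k] for m, v in zip(masses, velocities)) / total_mass
            for k in range(3)]

def split_kinetic_energy(masses, velocities):
    """Return (T_cm, T_rel, relative velocities), following (4.14) and (4.16)."""
    R_dot = centre_of_mass_velocity(masses, velocities)
    relative = [[v[k] - R_dot[k] for k in range(3)] for v in velocities]
    T_cm = 0.5 * sum(masses) * dot(R_dot, R_dot)
    T_rel = 0.5 * sum(m * dot(u, u) for m, u in zip(masses, relative))
    return T_cm, T_rel, relative
```

The sum of the two returned energies should equal the kinetic energy computed directly from the particle velocities.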

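Thus, only $\dot{\mathbf R}$ appears in the Lagrangian, and $\mathbf R$ does not. The
components of $\mathbf R$ are cyclic coordinates, therefore the total momentum is
conserved:

$$\mathbf P = \frac{\partial L}{\partial \dot{\mathbf R}} = \Big(\sum_{i=1}^{N} m_i\Big)\,\dot{\mathbf R} = \sum_{i=1}^{N} \mathbf p_i \qquad (4.17)$$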

Equation (4.17) shows that the total momentum of a system of 
mass points not subject to the action of external forces is an integral 
of the motion. It is important that this is an additive integral of the 
motion compounded from the momenta of individual particles. 

Note that the momentum integral exists for any system subject 
only to internal forces, including friction, which cause mechanical 
energy to transform into the energy of internal molecular motion. 
This does not affect the conservation of momentum. 

If (4.17) is again integrated with respect to time the result is 
a centre-of-mass integral analogous to (3.18). This is the so-called 
second integral (since it involves two constants). It contains only the 
current coordinates, but not the velocities; (4.7) is also a second 
integral. 

Properties of the Vector Product. Further on we shall investigate
the moment of momentum, or angular momentum, of a mass point
and a system of mass points. For a separate mass point it is
defined as

$$\mathbf M = \mathbf r \times \mathbf p \qquad (4.18)$$

Here the boldface multiplication sign denotes the vector product.
As is known, (4.18) takes the place of the following three equations
for the components:

$$M_x = y p_z - z p_y$$

$$M_y = z p_x - x p_z$$

$$M_z = x p_y - y p_x \qquad (4.19)$$

Recalling the geometric definition of a vector product, we
construct a parallelogram on vectors $\mathbf r$ and $\mathbf p$. Then $\mathbf r \times \mathbf p$ denotes a
vector equal in magnitude to the area of the parallelogram and
directed perpendicular to its plane. To state the direction of $\mathbf M$
uniquely we must agree on the direction in which we trace the
parallelogram. We shall always start from the first vector, in this case $\mathbf r$.
Then the positive side of the parallelogram is that in which the
direction is counterclockwise. Vector $\mathbf M$ is normal to the positive
side of the plane. To state this in another way, if we rotate a
corkscrew from $\mathbf r$ to $\mathbf p$, the displacement of the corkscrew itself will be in
the direction of $\mathbf M$. The direction changes if we interchange the
positions of $\mathbf r$ and $\mathbf p$ in the product. Therefore, unlike a conventional
product, the sign of the vector product changes when the factors are
interchanged. This can also be seen from the definition of the
Cartesian components of angular momentum.

In order to understand why a vector product defines precisely
a vector quantity we should clarify the definition of a vector in
general. A vector is an aggregate of three quantities which transform
in the rotations of a coordinate system as the components of the
radius vector. For example, velocity is a vector, because by
definition it is $d\mathbf r/dt$, and differentials of coordinates transform like the
coordinates themselves. Consequently, momentum $\mathbf p$ is also a vector.

If we carry out the transformation of the components of the radius
vector $(x, y, z)$ and of the momentum $(p_x, p_y, p_z)$ according to the
formulas of analytic geometry and substitute the transformed
quantities into (4.19), we find that the components $M_x$, $M_y$, $M_z$ have
themselves transformed according to the same formulas as $x$, $y$, $z$.
For this we should make use of the known relationships for the
cosines of the angles between the old and new coordinate axes. But
if some three equations, in this case (4.19), retain their form in
rotations of the coordinate system, they can be combined in one
vector equation (4.18) (Exercise 3).

The area of the parallelogram is $rp \sin\alpha$, where $\alpha$ is the angle
between $\mathbf r$ and $\mathbf p$. The product $r \sin\alpha$ is the length of a
perpendicular drawn from the origin of the coordinate system to the tangent
to the trajectory whose direction is the same as $\mathbf p$. This length is
sometimes called the “arm” of the moment.

The vector product possesses a distributive property, that is

$$\mathbf a \times (\mathbf b + \mathbf c) = (\mathbf a \times \mathbf b) + (\mathbf a \times \mathbf c)$$

Hence, a binomial vector product is calculated in the usual way,
but the order of the factors is taken into account:

$$(\mathbf a + \mathbf b) \times (\mathbf c + \mathbf d) = (\mathbf a \times \mathbf c) + (\mathbf b \times \mathbf c) + (\mathbf a \times \mathbf d) + (\mathbf b \times \mathbf d)$$
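
The component definition (4.19) and the two algebraic properties just stated, the change of sign on interchanging the factors and distributivity, can be illustrated with a few lines of code. This is a minimal sketch; the helper names `cross` and `add` are assumptions made for the example.

```python
def cross(a, b):
    """Vector product by components, in the same pattern as (4.19)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    """Component-wise sum of two vectors."""
    return tuple(x + y for x, y in zip(a, b))

a, b, c = (1, 2, 3), (-4, 5, 6), (7, -8, 9)
print(cross(a, b))          # a x b
print(cross(b, a))          # b x a has the opposite sign
print(cross(a, add(b, c)))  # equals (a x b) + (a x c)
```

Integer components are used so that the identities hold exactly, without rounding.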

Angular Momentum of a System of Mass Points. The angular
momentum of a system of mass points is defined as the sum of the
angular momenta of all the points taken separately. In doing so
we must, of course, take the radius vectors relative to a coordinate
origin common to all the particles. Thus

$$\mathbf M = \sum_{i=1}^{N} \mathbf r_i \times \mathbf p_i \qquad (4.20)$$

We shall show that the angular momentum of a system can be
separated into the angular momentum of relative motion of the mass
points and the angular momentum of the system as a whole, similar
to the way it was done for the kinetic energy. To do this we must
represent the radius vector of each mass point as the sum of the
radius vector of its position relative to the centre of mass and the
radius vector of the centre of mass itself; we must expand the
expression for the velocities of the mass points in the same way. Then, the
angular momentum can be written in the form (with $\mathbf p'_i = m_i \dot{\mathbf r}'_i$)

$$\mathbf M = \sum_{i=1}^{N} (\mathbf R + \mathbf r'_i) \times (m_i \dot{\mathbf R} + \mathbf p'_i)$$

$$= \sum_{i=1}^{N} \left[\, m_i (\mathbf R \times \dot{\mathbf R}) + (m_i \mathbf r'_i \times \dot{\mathbf R}) + (\mathbf R \times \mathbf p'_i) + (\mathbf r'_i \times \mathbf p'_i) \,\right]$$

In the second and third terms, we can make use of the distributive
property of the vector product and introduce the summation sign
inside the parentheses. However, both these sums are equal to zero,
by definition of the centre of mass. This was used in (4.15) for
velocities. Thus, the angular momentum of a system of mass points is
indeed equal to the sum of the angular momenta of the centre of
mass, $\mathbf M_0$, and the relative motion of the mass points, $\mathbf M'$:

$$\mathbf M = \mathbf R \times \mathbf P + \sum_i \mathbf r'_i \times \mathbf p'_i = \mathbf M_0 + \mathbf M' \qquad (4.21)$$

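Let us perform these transformations for the special case of a
system of two mass points. We substitute $\mathbf r_1$ and $\mathbf r_2$ expressed from
(3.14) in terms of $\mathbf r$ and $\mathbf R$ into (4.20). This gives

$$\mathbf M = \mathbf r_1 \times \mathbf p_1 + \mathbf r_2 \times \mathbf p_2$$

$$= \mathbf R \times (\mathbf p_1 + \mathbf p_2) + \frac{1}{m_1 + m_2}\,(m_2\, \mathbf r \times \mathbf p_1 - m_1\, \mathbf r \times \mathbf p_2)$$

Further, we replace $\mathbf p_1$ by $m_1 \dot{\mathbf r}_1$, $\mathbf p_2$ by $m_2 \dot{\mathbf r}_2$, and $\mathbf p_1 + \mathbf p_2$ by $\mathbf P$, after
which the angular momentum reduces to the required form:

$$\mathbf M = \mathbf R \times \mathbf P + \frac{m_1 m_2}{m_1 + m_2}\, \mathbf r \times \dot{\mathbf r} \qquad (4.22)$$

Here, $[m_1 m_2/(m_1 + m_2)]\, \dot{\mathbf r} = m \dot{\mathbf r} = \mathbf p$ is the momentum of relative
motion of the mass points. It involves the reduced mass $m$.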

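We shall now show that the angular momentum of relative motion
does not depend on the choice of the origin. Indeed, if we displace
the origin, then all the quantities $\mathbf r'_i$ change by the same amount:
$\mathbf r'_i = \mathbf r''_i + \mathbf a$.

Accordingly, the angular momentum of relative motion will be

$$\mathbf M' = \sum_{i=1}^{N} \mathbf r'_i \times \mathbf p'_i = \sum_{i=1}^{N} \mathbf r''_i \times \mathbf p'_i + \sum_{i=1}^{N} \mathbf a \times \mathbf p'_i$$

$$= \sum_{i=1}^{N} \mathbf r''_i \times \mathbf p'_i + \mathbf a \times \sum_{i=1}^{N} \mathbf p'_i = \mathbf M''$$

because

$$\sum_{i=1}^{N} \mathbf p'_i = \sum_{i=1}^{N} m_i \dot{\mathbf r}'_i = 0$$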

Thus, in calculating the angular momentum of relative motion the 
origin of the coordinates can be placed at any point. 
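
Both statements, the separation $\mathbf M = \mathbf M_0 + \mathbf M'$ and the independence of $\mathbf M'$ from the origin, lend themselves to a direct numerical check. The following sketch is an illustration only: all function names, the `shift` parameter, and the sample data are assumptions introduced here.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(s, a):
    return tuple(s * x for x in a)

def mass_weighted_mean(masses, vectors):
    """Centre of mass when given positions, its velocity when given velocities."""
    total = sum(masses)
    return tuple(sum(m * v[k] for m, v in zip(masses, vectors)) / total
                 for k in range(3))

def total_angular_momentum(masses, positions, velocities):
    """M as the sum of r_i x p_i about the common origin."""
    M = (0.0, 0.0, 0.0)
    for m, r, v in zip(masses, positions, velocities):
        M = add(M, cross(r, scale(m, v)))
    return M

def split_angular_momentum(masses, positions, velocities, shift=(0.0, 0.0, 0.0)):
    """Return (M_0, M'). `shift` displaces the origin of the relative radius vectors."""
    R = mass_weighted_mean(masses, positions)
    R_dot = mass_weighted_mean(masses, velocities)
    M0 = cross(R, scale(sum(masses), R_dot))
    M_rel = (0.0, 0.0, 0.0)
    for m, r, v in zip(masses, positions, velocities):
        arm = add(sub(r, R), shift)
        M_rel = add(M_rel, cross(arm, scale(m, sub(v, R_dot))))
    return M0, M_rel
```

A nonzero `shift` leaves the second returned vector unchanged, because the relative momenta sum to zero.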

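Conservation of Angular Momentum. We shall now show that the
angular momentum of an isolated system is an integral of the motion.
First take the angular momentum of the system as a whole,
$\mathbf M_0 = \mathbf R \times \mathbf P$. Its time derivative is

$$\frac{d\mathbf M_0}{dt} = \dot{\mathbf R} \times \mathbf P + \mathbf R \times \dot{\mathbf P} = 0$$

because $\dot{\mathbf P} = 0$ for systems not subject to external forces, and
$\dot{\mathbf R} \times \mathbf P = 0$ because $\dot{\mathbf R}$ is directed along $\mathbf P$.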

Let us prove that the angular momentum of the relative motion
of the mass points in the system is also conserved. Its total derivative
with respect to time is

$$\frac{d\mathbf M'}{dt} = \sum_{i=1}^{N} \dot{\mathbf r}'_i \times \mathbf p'_i + \sum_{i=1}^{N} \mathbf r'_i \times \dot{\mathbf p}'_i \qquad (4.23)$$

The first term in the right-hand side of (4.23) vanishes because $\dot{\mathbf r}'_i$ is
directed along $\mathbf p'_i$. Let us consider the second term. Recalling that
the potential energy $U$ depends on the absolute values of the differences
of all the coordinates, $U = U(\ldots |\mathbf r_i - \mathbf r_k| \ldots)$, from
the equations of motion we can write, with the help of formulas (3.1)
and (3.2)

$$\frac{d\mathbf p'_i}{dt} = -\sum_{k \ne i} \frac{\partial U}{\partial |\mathbf r_{ik}|}\, \frac{\mathbf r_{ik}}{|\mathbf r_{ik}|}, \qquad \mathbf r_{ik} = \mathbf r_i - \mathbf r_k$$

Here in the right-hand side we have the sum of all the forces with
which all the other points act on the $i$th point. We form the vector
product of $\mathbf r'_i$ with this equation and sum over all the mass
points. On the left we have the time derivative of the angular
momentum of the relative motion of the system, and on the right,
the double sum over all pairs. The partial derivatives $\partial U/\partial |\mathbf r_{ik}|$
are involved in this sum twice: for the $i$th and $k$th mass points.
They are multiplied by the vectors $\mathbf r'_i \times (\mathbf r'_i - \mathbf r'_k)$ and
$\mathbf r'_k \times (\mathbf r'_k - \mathbf r'_i)$ (note that $\mathbf r_i - \mathbf r_k = \mathbf r'_i - \mathbf r'_k$),
which in the sum vanish because $\mathbf r'_i \times \mathbf r'_i = 0$, $\mathbf r'_k \times \mathbf r'_k = 0$,
$\mathbf r'_i \times \mathbf r'_k = -\mathbf r'_k \times \mathbf r'_i$. Thus $d\mathbf M'/dt = 0$, so that the angular
momentum of the relative motion of the mass points in a closed system
is conserved.
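
The pairwise cancellation used in this proof can be seen numerically. The sketch below rests on assumptions made for the illustration: every pair interacts through the same function `dU_dr` (the derivative of the pair potential with respect to the distance), and the names `pair_forces` and `total_torque` are invented here. It builds the forces from the formula above and sums the moments $\mathbf r_i \times \mathbf F_i$.

```python
import math

def pair_forces(positions, dU_dr):
    """Force on each point: minus the sum over k of U'(|r_ik|) r_ik / |r_ik|,
    with r_ik = r_i - r_k."""
    forces = []
    for i, ri in enumerate(positions):
        f = [0.0, 0.0, 0.0]
        for k, rk in enumerate(positions):
            if k == i:
                continue
            rik = [a - b for a, b in zip(ri, rk)]
            dist = math.sqrt(sum(c * c for c in rik))
            for n in range(3):
                f[n] -= dU_dr(dist) * rik[n] / dist
        forces.append(f)
    return forces

def total_torque(positions, forces):
    """Sum of r_i x F_i over all points."""
    torque = [0.0, 0.0, 0.0]
    for r, f in zip(positions, forces):
        torque[0] += r[1] * f[2] - r[2] * f[1]
        torque[1] += r[2] * f[0] - r[0] * f[2]
        torque[2] += r[0] * f[1] - r[1] * f[0]
    return torque
```

For any potential depending only on the mutual distances, both the total force and the total moment of the internal forces come out as zero to rounding error, whatever origin is used for the radius vectors.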

In the next section we shall show that angular momentum (or its 
components) may be conserved in an external field as well, provided 
the field possesses the required symmetry. 

Additive Integrals of Motion for a Closed System. We have shown 
that a closed mechanical system has the following first integrals of 
motion: energy, three components of the momentum vector, and 
three components of the angular momentum vector. Linear and 
angular momenta are always additive, and energy is additive only 
for the noninteracting parts of the system. 

The mechanical energy referred to macroscopic degrees of freedom 
of a body as a whole is in very many cases not conserved. In the 
presence of friction forces it is transferred in the form of heat to the 
microscopic (molecular) degrees of freedom. Linear and angular 
momenta are always conserved in a closed mechanical system. 
The former is associated with the motion of the system’s centre of 
mass, the latter with its rotation about the centre of mass. Both 
these integrals of the motion belong to the macroscopic (mechanical) 
degrees of freedom. 

It is much more difficult to obtain all other mechanical integrals 
of motion (with the exception of the centre-of-mass integral), and 
no general rule for determining them can be formulated. 

The seven additive integrals of motion (energy, linear and angular
momenta; seven because the latter two are vectors) are special
cases in the sense that they owe their existence to symmetry with
respect to translations and rotations. Indeed, the symmetry of the
Lagrangian with respect to displacements in time leads to the energy
conservation law. Symmetry with respect to spatial displacements
imposes the restriction on the potential energy that it depend only on
the differences between the particles’ coordinates. Thanks to this
the motion of the centre of mass is separated, and the total
momentum of the system is conserved. We deduced the conservation of the
angular momentum of relative motion from the fact that the
potential energy involves the absolute values of the differences $|\mathbf r_i - \mathbf r_k|$,
which is in agreement with the property of space isotropy. From this
property can be deduced the conservation of angular momentum in
the case of less restrictive assumptions regarding $U$ (Sec. 5).


EXERCISES 


1. Describe the motion of a mass point moving along a cycloid in a
gravitational field.

Solution. The equation of the cycloid in parametric form is

$$z = -R \cos s, \qquad x = Rs + R \sin s$$

The kinetic energy of the point is

$$T = \frac{m}{2}\,(\dot x^2 + \dot z^2) = 2mR^2 \cos^2\frac{s}{2}\; \dot s^2$$

The potential energy is

$$U = -mgR \cos s$$

The total-energy integral is

$$E = 2mR^2 \cos^2\frac{s}{2}\; \dot s^2 - mgR \cos s = \text{constant}$$

The value of $E$ can be determined on the condition that the velocity $\dot s$
is equal to zero when the deflection is maximum, $s = s_0$; the mass point
moves along the cycloid from that position. Hence

$$E = -mgR \cos s_0$$

After separating the variables and integrating, we obtain

$$t = \sqrt{2}\, R \int \frac{\cos(s/2)\, ds}{(gR \cos s + E/m)^{1/2}} = \left(\frac{2R}{g}\right)^{1/2} \int \frac{\cos(s/2)\, ds}{(\cos s - \cos s_0)^{1/2}}$$

Denoting $\sin(s/2) = u$, we integrate and put the limits to get

$$t = \left(\frac{R}{g}\right)^{1/2} \int \frac{2\, du}{(u_0^2 - u^2)^{1/2}} = 2 \left(\frac{R}{g}\right)^{1/2} \arcsin\frac{u}{u_0}$$

In order to find the total period of the motion, we must take the integral
between the limits $-u_0$ and $+u_0$ and double the result. This corresponds to
the oscillation of the mass point from $s = -s_0$ to $s = s_0$, and back to
$s = -s_0$.

Thus the total period of oscillation is equal to $4\pi (R/g)^{1/2}$. As long as
the mass point moves on the cycloid, the period of its oscillation does not
depend on the oscillation amplitude $s_0$ (Huygens’ cycloidal pendulum). The
period of oscillation of an ordinary pendulum, which describes an arc of
a circle, is known, in the general case, to depend on the amplitude (cf. (4.7)).
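
The independence of the period from the amplitude can be confirmed by evaluating the quadrature in $s$ directly, without the substitution $u = \sin(s/2)$ that solves it in closed form. The sketch below is an illustration under assumptions: the function names are invented, and the substitution $s = s_0 \sin\theta$ is introduced only to remove the integrable singularity at $s = s_0$ before applying the midpoint rule.

```python
import math

def cycloid_quarter_period(R, g, s0, n=2000):
    """Time to slide from rest at s = s0 to the lowest point s = 0.

    Integrates sqrt(2R/g) * cos(s/2) / sqrt(cos s - cos s0) over s from 0 to s0,
    using s = s0 * sin(theta) and the midpoint rule in theta."""
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        s = s0 * math.sin(theta)
        ds_dtheta = s0 * math.cos(theta)
        total += math.cos(s / 2.0) * ds_dtheta / math.sqrt(math.cos(s) - math.cos(s0))
    return math.sqrt(2.0 * R / g) * total * h

def cycloid_period(R, g, s0, n=2000):
    """Full period: four times the quarter period."""
    return 4.0 * cycloid_quarter_period(R, g, s0, n)
```

For amplitudes $0 < s_0 < \pi$ the computed period stays at $4\pi (R/g)^{1/2}$, in contrast to the circular pendulum.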
2. Prove that a point moving along a curved line in a vertical plane
descends from a given upper position to a given lower position in the shortest
time if the curve is a cycloid.

Solution. Using the results obtained in Exercise 2, Section 3, we write
the energy integral for the point descending from rest along the curve:

$$\frac{m}{2}\,(x'^2 + z'^2)\, \dot s^2 + mgz(s) = mgz(s_0)$$

Hence the time of descent is given by the quadrature

$$t = (2g)^{-1/2} \int \frac{ds\, (x'^2 + z'^2)^{1/2}}{[z(s_0) - z(s)]^{1/2}}$$

Passing in the integral from the independent variable $s$ to the variable $z$,
we obtain

$$t = (2g)^{-1/2} \int \frac{dz\, [1 + (dx/dz)^2]^{1/2}}{(z_0 - z)^{1/2}}$$

The dependence of $x$ on $z$ must be so defined as to assure that $t$ has an
extremum. We considered a similar problem in connection with Hamilton's
principle, where we had to find the path corresponding to the minimum value
of the integral $S$. For this the integrand must satisfy the Lagrange equation.
Such an equation can, obviously, be written for solving the present problem,
taking $z$ as the independent variable, $x$ as the dependent variable, and the
integrand taken from the expression for $t$. Since the dependent variable is
not explicitly involved in the integrand, it is cyclic, and the corresponding
“momentum”, that is, the derivative of the integrand with respect to $dx/dz$, is
constant. Denoting it $-(2R)^{-1/2}$, we have

$$\frac{dx}{dz} \left[ 1 + \left(\frac{dx}{dz}\right)^2 \right]^{-1/2} = -\left(\frac{z_0 - z}{2R}\right)^{1/2}$$

Introducing new variables

$$z = z_0 - R\,(1 + \cos s), \qquad dz = R \sin s\, ds$$

yields

$$\frac{dx}{dz} = -\cot\frac{s}{2}$$

Passing from $z$ to $s$, we easily find

$$dx = -R\,(1 + \cos s)\, ds, \qquad x = -R\,(s + \sin s)$$

With $z_0 = R$ this is the cycloid written in the previous exercise; the
only difference is the direction of the $x$ axis, which is fixed by the
sign chosen for the constant.
From Exercise 1 the total time in which the particle descends from the
upper to the lower position is a quarter of the period, that is, $\pi (R/g)^{1/2}$.
In free fall from the height $2R$, equal to the ascent of the cycloid, the
time is $2 (R/g)^{1/2}$.

3. Show that in rotations of the coordinates the vector product defined
by formulas (4.19) transforms as a vector.

Solution. The transformation formulas for the components of a vector
in a rotation of a coordinate system are known from analytic geometry. We
write these formulas in compact form. Number the coordinates $x = x_1$,
$y = x_2$, $z = x_3$ and denote the cosine of the angle between the new axis $\alpha$
and the old axis $\beta$ as $(\alpha, \beta)$. Then the required form of the formulas is (taking
the sum with respect to the index that appears twice)

$$x'_\alpha = (\alpha, \beta)\, x_\beta$$

In view of symmetry between the old and new coordinates the inverse
transformation is written as follows:

$$x_\beta = x'_\alpha\, (\alpha, \beta) \quad \text{or} \quad x_\alpha = x'_\beta\, (\beta, \alpha)$$

The determinant of the transformation is unity. This is proved in the
following way. As is known from analysis, in a transformation of coordinates
the volume element is multiplied by the Jacobian determinant

$$\frac{\partial (x'_1, x'_2, x'_3)}{\partial (x_1, x_2, x_3)}$$

In this case the Jacobian coincides with the transformation determinant,
because the transformation is linear. On the other hand, a rotation cannot
affect the volume, so that the assertion is proved.

Now we construct the inverse transformation in general form. It is
known that the coefficient at the intersection of row $\alpha$ and column $\beta$
in an inverse transformation is the cofactor (signed minor) of the element
in row $\beta$ and column $\alpha$ of the direct transformation divided by the
determinant, in the present case by unity. At the same time, the inverse
transformation is carried out with the help of the same coefficients but
with the indices interchanged. Hence every coefficient $(\alpha, \beta)$ is equal
to its own cofactor. We now write the determinant in explicit form:

$$\begin{vmatrix} (1,1) & (1,2) & (1,3) \\ (2,1) & (2,2) & (2,3) \\ (3,1) & (3,2) & (3,3) \end{vmatrix}$$

Equating the elements of the first row to the corresponding cofactors,
we obtain

$$(1,1) = (2,2)(3,3) - (2,3)(3,2)$$

$$(1,2) = (2,3)(3,1) - (2,1)(3,3)$$

$$(1,3) = (2,1)(3,2) - (3,1)(2,2)$$

Now take three unit vectors directed along the axes of a right-handed
coordinate system (rotation of the corkscrew handle from $x$ to $y$ displaces
it along the $z$ axis). From the definition of the vector product the following
equation must hold:

$$\mathbf i = \mathbf j \times \mathbf k$$

where the unit vector $\mathbf i$ is directed along $x$, $\mathbf j$ is along $y$, and $\mathbf k$ is along $z$.
For this to be a genuine vector equation it should be satisfied not only in the
initial coordinate system. Projecting the unit vectors on new axes, we see
that the obtained relationships for the cosines of the angles between the old
and new axes assure the validity of the relationship between vectors $\mathbf i$, $\mathbf j$, $\mathbf k$
in all systems.

Since any two vectors can be resolved along unit vectors and a vector
product is distributive, it always yields a vector. Note that this is in a sense
a “fortuitous” property of vectors in three-dimensional space, insofar as there
happen to be three quantities of the type found in the right-hand sides of
(4.19).
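
The two facts used in this solution can be checked numerically for a sample rotation: each direction cosine equals its own cofactor, and the vector product of two transformed vectors equals the transformed vector product. The sketch below is an illustration only. The construction of the rotation from two elementary rotations and all function names are assumptions made here.

```python
import math

def rotation_matrix(alpha, beta):
    """Direction cosines for a rotation by alpha about z followed by beta about x."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    rz = [[ca, sa, 0.0], [-sa, ca, 0.0], [0.0, 0.0, 1.0]]
    rx = [[1.0, 0.0, 0.0], [0.0, cb, sb], [0.0, -sb, cb]]
    return [[sum(rx[i][k] * rz[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def cofactor(a, i, j):
    """Signed minor of element (i, j); cyclic indexing supplies the sign."""
    return (a[(i + 1) % 3][(j + 1) % 3] * a[(i + 2) % 3][(j + 2) % 3]
            - a[(i + 1) % 3][(j + 2) % 3] * a[(i + 2) % 3][(j + 1) % 3])

def transform(a, v):
    """New components x'_alpha = (alpha, beta) x_beta."""
    return tuple(sum(a[i][k] * v[k] for k in range(3)) for i in range(3))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])
```

Comparing `transform(a, cross(u, v))` with `cross(transform(a, u), transform(a, v))` shows that the three quantities (4.19) transform like the components of a vector.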


5 


MOTION IN A CENTRAL FIELD 


The Angular Momentum Integral. We shall consider the motion of
two bodies (particles) in a frame of reference fixed with respect to
their centre of mass. In such a reference frame angular momentum is
associated only with the relative motion of the bodies. If there are
no external forces the angular momentum is conserved. We denote
the separation of the particles $\mathbf r = \mathbf r_1 - \mathbf r_2$; the corresponding linear
momentum is

$$\mathbf p = \frac{m_1 m_2}{m_1 + m_2}\, \dot{\mathbf r} = m \dot{\mathbf r} \qquad (5.1)$$

This relationship is obtained from (3.17) by differentiation of the
particles’ Lagrangian with respect to $\dot{\mathbf r} = \mathbf v$. Whence in accordance
with (4.22) we find

$$\mathbf M = \mathbf r \times \mathbf p = m\, \mathbf r \times \dot{\mathbf r} = \text{constant} \qquad (5.2)$$

If the vector is constant, all three components are constant. Not
only its absolute value but its spatial direction as well does not
change. But from (5.2) the vector $\mathbf r$ of the relative position of the
particles and the vector $\dot{\mathbf r}$ of their relative velocity are perpendicular
to the constant direction of $\mathbf M$. And since all perpendiculars to the
same point of a line lie in one plane, the relative motion of the
particles of the system takes place in that plane. It can be seen from
(5.2) that vector $\mathbf r$ lies in the same plane as the path, which is a
plane curve.

When transforming to a spherical coordinate system, it is advisable
to direct the polar axis along $z$. Then the motion takes place in the
$x,y$-plane, or $\vartheta = \pi/2$, $\sin\vartheta = 1$, $\dot\vartheta = 0$.

The potential energy of a system of two mass points depends only
on the distance $r$ between them, because it is the only scalar
quantity that can be derived from vector $\mathbf r$. From (3.11) the Lagrangian
for plane motion at $\dot\vartheta = 0$, $\sin\vartheta = 1$ is

$$L = \frac{m}{2}\,(\dot r^2 + r^2 \dot\varphi^2) - U(r) \qquad (5.3)$$

where $m$ is the reduced mass.

If one mass is much greater than the other, it lies very close to the 
centre of mass of the system and its motion can be neglected. In that 
case the light particle (or system of light particles) moves around the 
heavy one like a planet in the solar system. The angular momentum 
conservation law holds, but only with respect to the central body 
and not to an arbitrary point in space. The conclusion that the paths 
of the moving particles are plane also holds, but only provided 
their interactions can be neglected or that they were moving in the 
same plane from the outset. 

Angular Momentum as Generalized Momentum. We shall now
show that the angular momentum component $M_z$, which in the
adopted coordinate system is equal simply to the absolute value $M$,
is nothing other than $p_\varphi = \partial L/\partial \dot\varphi$, that is, the generalized
momentum corresponding to the coordinate $\varphi$, which is the angle
of rotation about the $z$ axis. Indeed, from (4.19), (3.8b), and (3.8c),
the angular momentum $M = M_z$ is

$$M = M_z = x p_y - y p_x = mr \cos\varphi\,(\dot r \sin\varphi + r \cos\varphi\, \dot\varphi) - mr \sin\varphi\,(\dot r \cos\varphi - r \sin\varphi\, \dot\varphi)$$

$$= mr^2 (\cos^2\varphi + \sin^2\varphi)\, \dot\varphi = mr^2 \dot\varphi$$

On the other hand, differentiating $L$ with respect to $\dot\varphi$ we see that

$$p_\varphi = \frac{\partial L}{\partial \dot\varphi} = mr^2 \dot\varphi \qquad (5.4)$$

The quantity $r^2\dot\varphi/2$ is known as the areal velocity, that is, the area
swept out by the radius vector of a mass point in unit time. Indeed,
$r\dot\varphi$ is the base of a triangle whose vertex is at the origin, and r is
its altitude. The difference between the areas of this triangle and
the triangle produced by the displacement of the particle, which
need not be perpendicular to the radius, is an infinitesimal of the
second order. Thus, in geometrical interpretation the angular momentum
conservation law expresses the constancy of the areal
velocity in the orbital motion of a material point in a field of central
forces (Kepler's Second Law).
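The identity between $xp_y - yp_x$ and $mr^2\dot\varphi$, and the link to the areal velocity, can be checked numerically. The following is a minimal sketch; the function names and the sample values are illustrative assumptions, not part of the text.

```python
import math

def angular_momentum_cartesian(m, r, phi, r_dot, phi_dot):
    """M_z = x*p_y - y*p_x for plane motion specified in polar coordinates."""
    x, y = r * math.cos(phi), r * math.sin(phi)
    # Cartesian velocities from differentiating x = r cos(phi), y = r sin(phi)
    x_dot = r_dot * math.cos(phi) - r * math.sin(phi) * phi_dot
    y_dot = r_dot * math.sin(phi) + r * math.cos(phi) * phi_dot
    return x * (m * y_dot) - y * (m * x_dot)

def generalized_momentum_phi(m, r, phi_dot):
    """p_phi = dL/d(phi_dot) = m r^2 phi_dot, eq. (5.4)."""
    return m * r**2 * phi_dot

def areal_velocity(r, phi_dot):
    """Area swept out by the radius vector per unit time, r^2 phi_dot / 2."""
    return 0.5 * r**2 * phi_dot
```

For any sample state the two expressions for the angular momentum agree, and the areal velocity equals $M/(2m)$, so it is constant whenever M is.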

The relationship $M_z = p_\varphi$ offers a new explanation of the conservation
of angular momentum of a closed system. Indeed, the Lagrangian
of such a system cannot change in a rotation of the frame of reference 
through an arbitrary angle about an arbitrary axis in space. But then 
the angle of rotation is a cyclic coordinate, and the corresponding 
generalized momentum is conserved. And since the angle of rota¬ 
tion is arbitrary, all three angular momentum components of the 
closed system must be conserved. It can be seen from this reasoning 
that the law of conservation of angular momentum holds not only 
when the forces between points act along the lines joining them 
(as was assumed in Section 4) but in the most general case as well. 
Direct proof of this on the basis of the equations of motion is rather 
cumbersome. 

Elimination of the Azimuthal Velocity Component. The angular
momentum integral permits us to reduce the problem of two-particle
motion, or the problem of motion of a single particle in a central
field, to quadrature. To do this we must express $\dot\varphi$ in terms of angular
momentum and thus get rid of the superfluous variable, since angle $\varphi$
itself does not appear in the Lagrangian. In this fashion every cyclic
variable can be eliminated.

In accordance with (4.4), we first of all have the energy integral

$$E = \frac{m}{2}\left(\dot r^2 + r^2\dot\varphi^2\right) + U(r) \tag{5.5}$$

Eliminating $\dot\varphi$ with the aid of (5.4), we obtain

$$E = \frac{m\dot r^2}{2} + \frac{M^2}{2mr^2} + U(r) = \text{constant} \tag{5.6}$$

This first-order differential equation (in r) is later reduced to 
quadrature. Before writing down the quadrature, let us examine it 
graphically. 

The Dependence of the Form of Path on the Sign of the Energy. 
For such an examination, we must make certain assumptions about 
the variation of potential energy. 

From (2.1), force is connected with potential energy by the relation

$$F = -\frac{dU}{dr}, \quad\text{so that}\quad U(r) = \int_r F\,dr$$

The upper limit in the integral can be chosen arbitrarily. If F tends
to zero at infinite r faster than 1/r, then the integral $\int_r^\infty F\,dr$ is
convergent. Then we can put $U(r) = \int_r^\infty F\,dr$, or $U(\infty) = 0$. In other
words, the potential energy is chosen to be zero at infinity.

Furthermore, we assume that U(r) does not increase faster than
1/r at $r \to 0$, as, for example, in Newtonian attraction, where

$$U = -\int_r^\infty \frac{a}{r^2}\,dr = -\frac{a}{r}$$

Let us now write (5.6) as

$$\frac{m\dot r^2}{2} = E - \frac{M^2}{2mr^2} - U(r) \tag{5.7}$$

The left-hand side of this equation is always positive. At $r \to \infty$
the last two terms tend to zero. Thus for the particles to be able
to recede from each other to an infinite distance, the total energy
must be positive when the potential energy satisfies the condition
$U(\infty) = 0$. Hence, if two particles come together from infinity, their
energy must, according to the conservation law, be positive. If, as 
the two particles draw closer, the energy is not transferred to a third 
particle, on meeting they will of necessity separate again to an 
infinite distance. 

Given a definite form of U, we can plot the curve of the function

$$U_M(r) = \frac{M^2}{2mr^2} + U(r) \tag{5.8}$$

The index “M” denotes that the potential energy includes the “centrifugal”
energy $M^2/(2mr^2)$. The derivative of this quantity with respect
to r, taken with the opposite sign, is equal to $M^2/(mr^3)$. If we
put $M = mr^2\dot\varphi$, the result will be the usual expression for the centrifugal
force. However, in future we shall call a mechanical quantity
of different origin the “centrifugal force” (Sec. 8).
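Written out, the step from the “centrifugal” energy to the usual centrifugal force is the following short calculation, which uses only (5.4) and (5.8):

```latex
-\frac{d}{dr}\left(\frac{M^2}{2mr^2}\right)
  = \frac{M^2}{mr^3}
  = \frac{\left(mr^2\dot\varphi\right)^2}{mr^3}
  = m r \dot\varphi^2 .
```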

Let $U(r) < 0$ and increase monotonically as r changes from zero
to infinity. It follows that the force has a negative sign (since
$F = -dU/dr$), that is, it is an attractive force. Let us assume, in
addition, that at infinity $|U(r)| > M^2/(2mr^2)$. This is true, for
example, of Newtonian gravitational forces or for electrostatic
Coulomb forces between charged bodies.

Let us summarize the assumptions we have made regarding
$U_M(r)$:

(i) at $r \to 0$ the “centrifugal” term is predominant, hence $U_M(r)$
is infinite and positive;

(ii) at $r \to \infty$, where $|U(r)| \gg M^2/(2mr^2)$, $U_M(r)$ tends to zero
from the side of negative values.

Consequently, the curve $U_M(r)$ has the form shown in Figure 5.
At small values of r it falls off as $1/r^2$ with increasing r;
at large values of r it rises as $-a/r$, approaching
the r axis from below. The curve must have a minimum in the domain
of median values of r.

In the interaction of a charged body with a neutral one (for
instance, with an atom which has not lost a single electron) U(r)
decreases faster than the “centrifugal” energy. Therefore at large
distances $U_M(r)$ approaches the r axis from the positive side. Then
the curve $U_M(r)$ first passes through a minimum, then increases,
and after passing a maximum decreases again, tending to zero as
$r \to \infty$. This holds if there is a domain where U(r) predominates
over the “centrifugal” energy. Otherwise the decrease is monotonic.

The total energy of a system of converging particles can also be
plotted in Figure 5. Since E is conserved in motion, the curve has
the form of a horizontal straight line lying above or below the r axis,
depending on the sign of E. For positive values of energy, the line
E = constant lies above the curve $U_M(r)$ everywhere to the right
of point A. In this case the difference $E - U_M(r)$ is positive. The
particles can approach each other from infinity and recede from each
other to infinity. Such motion is termed infinite. As we shall see
later in this section, in the case of Newtonian attraction we obtain
hyperbolic orbits.

At $E < 0$, but higher than the minimum of the curve $U_M(r)$,
the difference $E - U_M(r) = m\dot r^2/2$ remains positive only between
points B and B'. Consequently, between the corresponding two
values of the radius lies a physically possible domain of motion with
the given negative total energy. The motion is in this case termed
finite. In the case of Newtonian attraction elliptical orbits correspond
to it. In the motion of a planet about the sun point B corresponds
to the perihelion, point B' to the aphelion.

At E = 0 the motion is infinite. As r increases the velocity tends
to zero, remaining positive. In the case of Newtonian attraction
parabolic orbits correspond to this value of E.
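The graphical argument can be mirrored in a few lines of code. The sketch below assumes the Newtonian form $U(r) = a/r$ with $a < 0$ and non-zero M, and solves $E = U_M(r)$ for the radii that bound the motion; the function names are hypothetical.

```python
import math

def effective_potential(r, M, m, a):
    """U_M(r) of eq. (5.8) for U(r) = a/r (a < 0 means attraction)."""
    return M**2 / (2.0 * m * r**2) + a / r

def turning_points(E, M, m, a):
    """Radii where E = U_M(r), i.e. where the radial velocity vanishes.

    Assumes attraction (a < 0) and M != 0. Returns (r_min, r_max);
    r_max is math.inf for infinite motion (E >= 0).
    """
    if a >= 0 or M == 0:
        raise ValueError("sketch covers attraction with non-zero M only")
    u_min = -m * a**2 / (2.0 * M**2)   # minimum of U_M, reached at r = M^2/(m|a|)
    if E < u_min:
        raise ValueError("E lies below the minimum of U_M: no motion possible")
    if E == 0:
        return M**2 / (2.0 * m * abs(a)), math.inf
    # E = U_M(r) is the quadratic E r^2 - a r - M^2/(2m) = 0
    root = math.sqrt(max(0.0, a**2 + 2.0 * E * M**2 / m))
    r_min = (a + root) / (2.0 * E)
    r_max = math.inf if E > 0 else (a - root) / (2.0 * E)
    return r_min, r_max
```

For $E < 0$ the two radii play the role of points B and B' in Figure 5; for $E \ge 0$ only the inner turning point exists and the motion is infinite.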

Falling Onto the Centre. From the preceding reasoning it is apparent
that r cannot decrease to zero owing to “centrifugal” energy.
Only if the particles are “targeted” on each other does the arm vanish,
so that M = 0, and the curve $U_M(r)$ is replaced by the curve U(r).
Then nothing prevents the particles from colliding.

Let us now investigate an imaginary rather than a real case, when
$-U(r)$ tends to infinity as $r \to 0$ faster than $1/r^2$. In this case $U_M(r)$
is negative for all r’s close to zero. From (5.7), $\dot r^2$ is positive at
infinitesimal values of r and tends to infinity as $r \to 0$. But the particle
cannot collide head-on because such a collision would violate the law
of conservation of angular momentum. The angular momentum is
equal to $m\rho v_\perp$, where $\rho$ is the arm. For M to retain its given finite
value when the arm decreases to infinitesimal values the velocity
component $v_\perp$ perpendicular to the radius must tend to infinity as
$1/\rho$. Then the product $m\rho v_\perp$, which defines the angular momentum,
remains finite.

Thus, if $U_M = -\infty$ at r = 0, the radial component of the velocity
tends to zero, and the azimuthal component tends to infinity. The
path of the particle has the form of a helix winding around the
attracting centre, but never reaching it. The coils of the helix decrease,
but the rotation speed increases. The force of “centrifugal”
repulsion cannot prevent the particles from drawing gradually closer,
which takes place the slower the smaller r is.

In the motion of three bodies gravitating towards one another 
according to Newton’s law, two of them may collide even if at the 
initial time their motion was not purely radial. Indeed, only the 
total angular momentum of relative motion is conserved, and this 
does not preclude the collision of the two bodies. 

Finally, for repulsive forces tending to infinity as $r \to 0$, falling
onto the centre is impossible. Obviously in this case the motion is
only infinite.

Reducing to Quadrature. Let us now find the equation of the path
in general form. To do this we must in (5.7) change from differentiation
with respect to time to differentiation with respect to $\varphi$.
Using (5.4) we have

$$\dot r = \frac{dr}{d\varphi}\,\dot\varphi = \frac{M}{mr^2}\,\frac{dr}{d\varphi} \tag{5.9}$$

Separating the variables and passing to $\varphi$ in (5.7) yields

$$\varphi = \int_{r_0}^{r} \frac{(M/mr^2)\,dr}{\left[\dfrac{2}{m}\bigl(E - U(r)\bigr) - \dfrac{M^2}{m^2r^2}\right]^{1/2}} \tag{5.10}$$

As can be seen from the equation, angle $\varphi$ is put equal to zero
at $r = r_0$. We assume that this value of r corresponds to point A
or B in Figure 5, that is, to the “perihelion”. At these points r ceases
to decrease, that is, achieves the minimum, and $\dot r = 0$. Point $r = r_0$
is determined from Eq. (5.7):

$$E = \frac{M^2}{2mr_0^2} + U(r_0) \tag{5.11}$$

Kepler’s Problem. Equation (5.10) offers a complete solution of 
the problem of the motion of a mass point in a central field. The 
answer in quadrature form involves the initial data. If they are 
known, integration can be carried out in one way or another. The 
fact that the integral sometimes cannot be solved in terms of ele¬ 
mentary functions is not essential. 
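When no closed form is available, the quadrature (5.10) can be evaluated numerically. The following is a minimal sketch, not a production integrator: it assumes the substitutions $u = 1/r$ and $u = u_0 - s^2$ (chosen here to remove the inverse-square-root singularity at the perihelion) and a plain midpoint rule; the function name and the parameter `n` are hypothetical.

```python
import math

def path_angle(r, r0, E, M, m, U, n=4000):
    """Angle phi swept from the perihelion r0 out to r, eq. (5.10).

    U is a callable giving the potential energy U(r); r0 must satisfy (5.11).
    """
    u0, u1 = 1.0 / r0, 1.0 / r
    s_max = math.sqrt(u0 - u1)
    h = s_max / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h                  # midpoints never touch s = 0
        u = u0 - s * s
        radicand = 2.0 * (E - U(1.0 / u)) / m - (M * u / m) ** 2
        # du = -2 s ds; the factor s cancels the zero of sqrt(radicand) at u0
        total += (M / m) * 2.0 * s / math.sqrt(radicand) * h
    return total
```

For the attractive Newtonian case with m = M = 1, a = -1, E = -0.25, the call `path_angle(1.0, r0, ...)` with $r_0 = 1/(1 + \sqrt{0.5})$ should return a value close to $\pi/2$, in agreement with the closed-form orbit derived for Kepler's problem.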

But, of course, if the solution is obtained in the form of a known 
and thoroughly studied function, it is of special interest, because 
it readily lends itself to graphic investigation. 

A simple solution in elementary functions can be found only in 
a few cases. One of them is that of central forces decreasing inversely 
as the square of the distance. This is the law governing the forces 
of Newtonian attraction between mass points (or bodies possessing 
spherical symmetry). 

The problem on the motion of two such bodies is known as Kepler's 
problem , since it was Kepler who empirically established the laws 
for this case from available data on the motions of the planets across 
the sky. Newton later developed Kepler’s laws theoretically from 
the equations of mechanics and the law of gravitation as a supple¬ 
mentary hypothesis regarding forces of interaction. From thence the 
systematic development of exact natural science began. 

Today the term “Kepler’s problem” is applied to any forces inversely
proportional to the square of the distance between two moving
points, regardless of their nature or sign. Thus, the Coulomb
interaction also falls within the scope of Kepler’s problem. We assume
the constant a in the force law $F = a/r^2$ and in $U(r) = a/r$ to be
negative or positive, depending upon whether the investigated
particles are mutually attracted or repulsed.

If we replace M/(mr) in (5.10) by a new variable x, the integral
in Kepler’s problem is reduced to the form

$$\varphi = -\int_{x_0}^{x} \frac{dx}{\left(2E/m - 2ax/M - x^2\right)^{1/2}} = \arccos\frac{x + a/M}{\left(a^2/M^2 + 2E/m\right)^{1/2}}\,\Bigg|_{x_0 = M/(mr_0)}^{x = M/(mr)}$$

The angle is measured counterclockwise from the perihelion (see
Figure 6 on p. 64).

Substitution of the lower integration limit yields zero. This is
seen both from (5.10) according to the choice of the origin for $\varphi$,
and directly from the equation. Indeed, when the radicand vanishes,
we have unity under the arc cos sign, and arc cos 1 = 0.

Inverting the integration result and turning to the variable r,
we obtain after some simple transformations

$$r = \frac{M^2}{ma}\left[-1 + \frac{M}{a}\left(\frac{a^2}{M^2} + \frac{2E}{m}\right)^{1/2}\cos\varphi\right]^{-1} \tag{5.12a}$$

This formula is valid for both signs of the force constant a. Let
us write it for each sign separately. First let $a > 0$, which corresponds
to repulsive forces. Then, after a slight transformation of (5.12a),
we obtain for the path of a point in a force field the equation of
a hyperbola

$$r = \frac{M^2}{am}\left[\left(1 + \frac{2EM^2}{ma^2}\right)^{1/2}\cos\varphi - 1\right]^{-1} \tag{5.12b}$$

Its eccentricity is $[1 + 2EM^2/(ma^2)]^{1/2}$ and consequently greater
than unity. At $\varphi = 0$ the denominator of the fraction has its greatest
value, and r its smallest (the perihelion). But as $\varphi$ increases $\cos\varphi$
decreases, and at some point the denominator vanishes, while r
tends to infinity. The corresponding angle $\varphi_0$ gives the direction of
the asymptote to the hyperbola (see Figure 6). Greater values of $\varphi$
are meaningless since the radius vector of the mass point corresponding
to them would be negative.

For forces of attraction, $a < 0$, and from (5.12a) we have

$$r = \frac{M^2}{|a|m}\left[\left(1 + \frac{2EM^2}{ma^2}\right)^{1/2}\cos\varphi + 1\right]^{-1} \tag{5.12c}$$

Now the path may be of two forms, depending on the sign of the
energy. At $E > 0$ the eccentricity of the curve is again greater than
unity, and we have a hyperbolic trajectory. But since we now have
a plus sign in the denominator the angle $\varphi$ at which the path extends
into infinity is greater than $\pi/2$. This means that a particle approaching
from infinity is deflected from the centre of attraction and skirts
it. It encloses the focus of the curve within the asymptotes, while the
hyperbola in Figure 6, drawn for the case of repulsive forces, corresponds
to the focus being outside the angle between the asymptotes.
This circumstance is quite obvious.

If $E < 0$, the eccentricity is less than unity. Since $|\cos\varphi|$ never
exceeds unity, the denominator in (5.12c) never vanishes,
and we obtain the equation of an ellipse; $\varphi = 0$ corresponds to the
perihelion, and $\varphi = \pi$ to the aphelion.
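The orbit equations (5.12b) and (5.12c) translate directly into code. The sketch below is illustrative only; the function names are assumptions, and the classification simply restates the cases discussed in the text (the parabola at E = 0 was mentioned earlier in this section).

```python
import math

def eccentricity(E, M, m, a):
    """e = [1 + 2 E M^2 / (m a^2)]^(1/2)."""
    return math.sqrt(1.0 + 2.0 * E * M**2 / (m * a**2))

def orbit_radius(phi, E, M, m, a):
    """r(phi) from (5.12b) for a > 0 (repulsion) or (5.12c) for a < 0 (attraction)."""
    e = eccentricity(E, M, m, a)
    p = M**2 / (m * abs(a))
    denom = e * math.cos(phi) - 1.0 if a > 0 else e * math.cos(phi) + 1.0
    if denom <= 0.0:
        raise ValueError("phi lies beyond the asymptote: no point of the path there")
    return p / denom

def orbit_type(E, a):
    """Form of the path according to the signs of a and E."""
    if a > 0 or E > 0:
        return "hyperbola"
    return "parabola" if E == 0 else "ellipse"
```

Evaluating `orbit_radius` at $\varphi = 0$ and $\varphi = \pi$ for a bound orbit gives the perihelion and aphelion distances.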


EXERCISES 

1. A mass point m is travelling towards an attraction centre for which
the potential energy expression is $-|a|/r^2$. The velocity of the particle
at infinity is given in magnitude and direction. A straight line is drawn
from the centre parallel to that direction. The distance between the line
and the path at infinity is $\rho$. Determine the value of $\rho$ at which the paths
receding again into infinity and the paths spiraling towards the centre
separate.

2. Obtain the path equation for the case of $U = |a|r^2/2$, $E > 0$.

Answer. A circle or an ellipse; unlike the case of Kepler’s problem, the 

centre of attraction lies at the centre of the orbit. 

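3. Prove that Kepler’s problem involves a supplementary integral
of the motion expressed in the form of the vector

$$\mathbf{N} = \mathbf{v}\times\mathbf{M} - \frac{a}{r}\,\mathbf{r}$$

Solution. Differentiate N with respect to time and then substitute
$m\mathbf{r}\times\mathbf{v}$ for $\mathbf{M}$, and $-a\mathbf{r}/r^3$ for $m\dot{\mathbf{v}}$ to get

$$\dot{\mathbf{N}} = \dot{\mathbf{v}}\times\mathbf{M} - \frac{a}{r}\,\mathbf{v} + \frac{a}{r^3}\,\mathbf{r}\,(\mathbf{r}\cdot\mathbf{v})
= -\frac{a}{r^3}\,\mathbf{r}\times(\mathbf{r}\times\mathbf{v}) - \frac{a}{r^3}\left[r^2\mathbf{v} - \mathbf{r}\,(\mathbf{r}\cdot\mathbf{v})\right] = 0$$

Vector N is directed from the focus to the perihelion and is numerically
equal to $|a|e$, where e is the eccentricity of the orbit,
$e = [1 - 2|E|M^2/(ma^2)]^{1/2}$.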

The fact that vector N is constant is closely connected with the form 
of the force law in Kepler’s problem, where the path has the form of an 
ellipse fixed in space. For any other dependence of the potential energy on 
distance (except for the case given in Exercise 2) the integral (5.10) computed 
between the two positions of minimum approach of the attracting particles
is not a simple multiple of $2\pi$ (or of $\pi$, as in Exercise 2). At least, arbitrary
values of the integrals E and M do not yield a simple multiple. But this 
means that in this case the path does not have the form of a closed curve, 
that is, the “perihelion” rotates in space. Accordingly, the path takes the 
form of a “rosette”. The result is also a rosette when Kepler’s problem is 
solved according to the laws deriving from relativity theory (see Exercise 9, 
Section 14). 


6 


COLLISION OF PARTICLES 

The Significance of Collision Problems. In order to determine the 
forces acting between particles, it is necessary to study the motion 
of the particles caused by these forces. In this way Newton’s gravita¬ 
tional law was established with the aid of Kepler’s laws. Here, the 
forces were determined from finite motion. However, infinite motion 
can also be used if one particle can in some way be accelerated to 
a definite velocity and then made to pass close to another particle. 
Such a process is termed “collision” of particles. It is not at all 
assumed, however, that the particles actually come into contact
in the sense of “collision” in everyday life. 

Neither is it necessary that the incident particle should be arti¬ 
ficially accelerated in a machine: it may be obtained in ejection from 
a radioactive nucleus or as the result of a nuclear reaction, or it may 
be a fast particle in cosmic radiation. 

Two approaches are possible to problems on particle collisions. 

Firstly, it may be only the velocities of the particles long before 
the collision (before they begin to interact) that are given, and the 
problem is to determine their velocities (magnitude and direction) 
after they have ceased to interact, that is, when they have receded 
to an infinite distance. To solve this kind of problem it is necessary 
to state some quantities characterizing the collision, for example, 
the change in energy of the colliding particle or its angle of deflec¬ 
tion. Then all the other quantities can be determined with the help 
of conservation laws. Thus, it is the result of the collision that is 
summed up, without going into the details of its course. 

However, another approach is possible: it is required to precal¬ 
culate the final state where the precise initial state is given. 

We shall first consider collisions by the first method. It is clear 
why, if only the initial velocities of the particles are known, the 
collision is not completely determined: it is not known at what dis¬ 
tance the particles pass by each other, since we do not know their
initial positions. Therefore some quantity relating to the final
state of the system must be given. Usually the problem is stated as 
follows: the initial velocities of the colliding particles and also the 
direction of velocity of one of them after the collision are specified. 
It is required to determine all the remaining quantities after the 
collision. 

If the total kinetic energy of the particles is the same before and 
after the collision or if its change as a result of the collision is stated 
exactly, the problem has one solution. There are six unknown quantities:
the six linear momentum components of both particles.
The conservation laws give four equations: one corresponding to the 
conservation of the energy (taking into account possible dissipation 
if the collision is inelastic), and three expressing the conservation 
of the vector quantity of total linear momentum. 

A collision is inelastic when a portion of the kinetic energy of the 
colliding particles is transferred to the internal degrees of freedom. 
In that case the energy balance must include the portion “trapped” by 
the internal degrees of freedom, and it must be stated in advance. 
If the kinetic energy of the colliding particles does not change, the 
collision is called elastic. 

One of the quantities characterizing the state of the particles 
after the collision is usually of no interest: the plane in which the 
linear momenta of the receding particles lie. They can be arbitrarily 
stated to be flying apart in, for example, the plane of a diagram in 
which both final linear momenta are depicted. Thus for the six 
required quantities we have four equations and one arbitrary plane. 
It is therefore necessary to state one more quantity characterizing the 
collision, for instance, the angle of deflection of the incident particle. 

There exists a more general type of inelastic collisions in which 
not only the internal energy of the particles changes but their nature 
as well. That is what occurs in nuclear reactions. It is then necessary 
to state the masses of the resultant particles. But strictly speaking 
this case cannot be treated in the framework of Newtonian mechanics, 
since it is necessary to take into account the equivalence of mass and 
energy in accordance with the mechanics of Einstein’s relativity 
theory (see Part II). We shall consider such collisions for the case 
when the change in total mass can be neglected. 

The Laboratory and Centre-of-Mass Frames of Reference. When 
collisions are observed in laboratory conditions, one of the particles — 
belonging to the target—is usually at rest prior to the collision. 
A frame of reference fixed with respect to the target (or the labora¬ 
tory) is called the laboratory frame of reference. In it the colliding 
particles have a total linear momentum equal to the momentum of 
the incident particle; the linear momentum of the second particle 
in this system is, by definition, zero. In accordance with the law of
conservation of linear momentum, the scattering particles must
possess the same total momentum. 

It follows from this that both these particles also possess the 
kinetic energy of the common motion of their centre of mass, which 
is given by formula (3.17); namely, the first term in the right-hand 
side. Thus, part of the kinetic energy of the incident particle prior 
to the collision must necessarily be imparted to the centre of mass 
of the particles following the collision. This portion of the energy 
cannot be usefully expended if a transmutation act requiring an 
expenditure of energy results from the collision. That is why it is 
common to employ a frame of reference fixed with respect to the 
centre of mass of the particles. 

In this system the total linear momentum of the colliding particles 
is zero; initially they are moving towards one another, after the 
collision they are receding in strictly opposite directions; in the 
most general case, at some angle to the initial direction of the linear 
momenta. In the centre-of-mass reference frame the total energy of 
the particles prior to the collision may be expended on a transmuta¬ 
tion. Obviously, in this case the collision is inelastic to the highest 
degree. 

Differentiation of formula (3.12) with respect to time yields the
velocity of the centre-of-mass frame relative to the laboratory frame:

$$\mathbf{V} = \frac{m_1}{m_1 + m_2}\,\mathbf{v}_0 \tag{6.1}$$

Here, $\mathbf{v}_0$ is the velocity of the first particle relative to the second,
$m_1$ is the mass of the first particle, $m_2$ is the mass of the second particle,
which was at rest prior to the collision, $\mathbf{V}$ is the required
velocity of the centre-of-mass frame relative to the laboratory frame.
Evidently, the quantity $\mathbf{v}_0$, that is, the relative velocity of the
particles, is the same in both reference frames. This holds as long
as the simple law of velocity composition can be applied, that is,
as long as the velocities are small in comparison with the speed of
light.


The General Case of an Inelastic Collision. The velocity of the first
particle relative to the centre-of-mass frame is, according to the law
of addition of velocities,

$$\mathbf{v}_{10} = \mathbf{v}_0 - \mathbf{V} = \frac{m_2}{m_1 + m_2}\,\mathbf{v}_0 \tag{6.2}$$

and in the same reference frame, the velocity of the second particle is

$$\mathbf{v}_{20} = -\mathbf{V} = -\frac{m_1}{m_1 + m_2}\,\mathbf{v}_0 \tag{6.3}$$
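The change of frame for a target initially at rest can be sketched as follows. The function name and the use of plain tuples for vector components are illustrative assumptions.

```python
def centre_of_mass_velocities(m1, m2, v0):
    """Return (V, v10, v20) for a target initially at rest.

    v0 is a tuple with the components of the incident particle's velocity.
    """
    total = m1 + m2
    V = tuple(m1 * c / total for c in v0)      # velocity of the centre of mass
    v10 = tuple(m2 * c / total for c in v0)    # incident particle in the c.m. frame
    v20 = tuple(-m1 * c / total for c in v0)   # target particle in the c.m. frame
    return V, v10, v20
```

With m1 = 1, m2 = 3 and v0 = (4, 0) this gives V = (1, 0), v10 = (3, 0) and an x component of -1 for v20, so the two momenta in the centre-of-mass frame cancel.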


Thus, $m_1\mathbf{v}_{10} + m_2\mathbf{v}_{20} = 0$, as it should be in this reference frame.

In accordance with (3.17), the energy in the centre-of-mass
frame is

$$E_0 = \frac{m_1 m_2}{2(m_1 + m_2)}\,v_0^2 = \frac{m_0 v_0^2}{2} \tag{6.4}$$

Here, the reduced mass is indicated by a zero subscript because
in nuclear reactions it may change. Without invoking considerations
of mass-energy equivalence, let us simply agree that as a result
of the reaction a certain energy Q is released or absorbed, that is,
energy transforms from internal energy (corresponding to internal
degrees of freedom) into the kinetic energy of the particles ($Q > 0$)
or, conversely, it transforms from kinetic energy into the energy of
the internal state ($Q < 0$).

In the case of nuclear reactions Q is the energy associated with
rearrangement in the system. The same is true of chemical reactions,
but here it should be noted that two colliding atoms cannot turn
into two other atoms, while colliding molecules cannot even be
regarded as points (their structure is important). Collisions between
neutral atoms may involve changes in their internal energy states,
in which case the formulas derived here are also applicable.

Taking into account the energy Q, the energy conservation law in
a collision must be written thus:

$$\frac{m_0 v_0^2}{2} + Q = \frac{m v^2}{2} \tag{6.5}$$


Here, $m = m_3 m_4/(m_3 + m_4)$ is the reduced mass of the particles
produced in the collision, and v is their relative velocity.

In order to specify the collision completely, we consider that the
direction of $\mathbf{v}$ is known: its absolute value is given by (6.5). The
velocities of each particle separately are

$$\mathbf{v}_{30} = \frac{m_4}{m_3 + m_4}\,\mathbf{v}, \qquad \mathbf{v}_{40} = -\frac{m_3}{m_3 + m_4}\,\mathbf{v} \tag{6.6}$$

They satisfy the requirement $m_3\mathbf{v}_{30} + m_4\mathbf{v}_{40} = 0$, that is, the law
of conservation of momentum in the centre-of-mass reference frame.
It is easy to verify that the energy conservation law also holds,
since

$$\frac{m_3 v_{30}^2}{2} + \frac{m_4 v_{40}^2}{2} = \frac{m v^2}{2}$$


Now it is not difficult to revert to the laboratory frame of reference.
The velocities of the particles in this frame are

$$\mathbf{v}_3 = \mathbf{v}_{30} + \mathbf{V} = \frac{m_4\mathbf{v}}{m_3 + m_4} + \frac{m_1\mathbf{v}_0}{m_1 + m_2}, \qquad
\mathbf{v}_4 = \mathbf{v}_{40} + \mathbf{V} = -\frac{m_3\mathbf{v}}{m_3 + m_4} + \frac{m_1\mathbf{v}_0}{m_1 + m_2} \tag{6.7}$$

In nuclear reactions the change in the total mass of the particles
is no more than a fraction of a percentage point, so that in Eqs. (6.7)
$m_1 + m_2$ can be substituted for $m_3 + m_4$. Thus, knowing the direction
$\mathbf{v}/v$, we obtain the complete solution of the problem. Usually the direction
of the emerging particles depends upon the position of the detecting
device relative to the target. If the energy is measured at the time
of recording, the energy effect Q of the reaction can be computed.
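The chain (6.5), (6.6), (6.7) can be followed step by step in code. The sketch below works in the plane of the collision with the x axis along the incident velocity; the function name, argument order, and the angle `chi` used to give the direction of v in the centre-of-mass frame are assumptions made for illustration.

```python
import math

def inelastic_lab_velocities(m1, m2, m3, m4, v0, Q, chi):
    """Laboratory-frame velocities (x, y) of the two products.

    v0  : speed of particle 1 along x; particle 2 is at rest
    Q   : energy released (Q > 0) or absorbed (Q < 0) in the collision
    chi : angle between v and v0 in the centre-of-mass frame
    """
    m0 = m1 * m2 / (m1 + m2)             # reduced mass before the collision
    m = m3 * m4 / (m3 + m4)              # reduced mass after the collision
    v_sq = (m0 * v0**2 + 2.0 * Q) / m    # energy balance (6.5)
    if v_sq < 0.0:
        raise ValueError("not enough energy in the centre-of-mass frame")
    v = math.sqrt(v_sq)
    vx, vy = v * math.cos(chi), v * math.sin(chi)
    V = m1 * v0 / (m1 + m2)              # centre-of-mass velocity (6.1), along x
    v3 = (m4 * vx / (m3 + m4) + V, m4 * vy / (m3 + m4))
    v4 = (-m3 * vx / (m3 + m4) + V, -m3 * vy / (m3 + m4))
    return v3, v4
```

When $m_3 + m_4 = m_1 + m_2$, the returned velocities conserve the total linear momentum $m_1 v_0$, and the kinetic energy after the collision exceeds the initial one by exactly Q.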


Elastic Collisions. The computations are simplified if the collision
is elastic, for then $m_3 = m_1$, $m_4 = m_2$, $Q = 0$. It follows from
(6.5) that $v_0 = v$, so that the relative velocity changes only in
direction and not in magnitude. Let us suppose that its angle of
deflection $\chi$ is given. We take the x axis along $\mathbf{v}_0$ in the plane determined
by vectors $\mathbf{v}_0$ and $\mathbf{v}$. Then

$$v_x = v_0\cos\chi, \qquad v_y = v_0\sin\chi$$

From (6.6) and (6.7) the components of the particles’ velocity in the laboratory
reference frame will, after collision, be correspondingly equal to

$$v_{1x} = \frac{(m_1 + m_2\cos\chi)\,v_0}{m_1 + m_2}, \qquad v_{1y} = v_{10y} = \frac{m_2 v_0\sin\chi}{m_1 + m_2} \tag{6.8}$$

$$v_{2x} = \frac{m_1(1 - \cos\chi)\,v_0}{m_1 + m_2}, \qquad v_{2y} = v_{20y} = -\frac{m_1 v_0\sin\chi}{m_1 + m_2} \tag{6.9}$$


By means of these equations, the deflection angle $\theta$ of the first
particle in the laboratory frame of reference can be related to the
deflection angle $\chi$ in the centre-of-mass reference frame in the following
manner:

$$\tan\theta = \frac{v_{1y}}{v_{1x}} = \frac{m_2\sin\chi}{m_1 + m_2\cos\chi} \tag{6.10}$$

The “recoil” angle of the second particle, which before collision
was at rest in the laboratory frame of reference, is given by the formula

$$\tan\theta' = -\frac{v_{2y}}{v_{2x}} = \frac{\sin\chi}{1 - \cos\chi} = \cot\frac{\chi}{2}, \qquad \theta' = \frac{\pi - \chi}{2} \tag{6.11}$$

Angle $\theta'$ is taken with respect to the x axis; the minus in the definition
of $\tan\theta'$ is chosen because the signs of $v_{1y}$ and $v_{2y}$ are opposite.

Equation (6.10) becomes still simpler if the masses of the colliding
particles are equal. This is true if the two particles are, say,
protons, and is approximately true if one is a proton and the other
a neutron. Then, from (6.10),

$$\tan\theta = \tan\frac{\chi}{2}, \qquad \theta = \frac{\chi}{2}, \qquad \theta' = \frac{\pi - \chi}{2}, \qquad \theta + \theta' = \frac{\pi}{2} \tag{6.12}$$

so that the particles fly apart at right angles, while the deflection
angle of a neutron in the laboratory frame is half its deflection angle
in the centre-of-mass frame. Since the latter varies from 0 to 180°,
$\theta$ never exceeds 90°. This is obvious without computations, because
the masses of the particles are equal. In a head-on collision the
incident particle comes to a halt while the resting particle is propelled
straight ahead with the speed of the incident particle. This
is because in a head-on collision in the centre-of-mass frame particles
of equal mass simply exchange velocities. In the laboratory frame
the second particle thus receives the initial velocity of the first,
which comes to a halt, as was asserted.
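The relation between the centre-of-mass and laboratory angles is easy to tabulate. This is a minimal sketch with a hypothetical function name; `math.atan2` is used instead of a plain arctangent so that the result stays in the correct quadrant when $m_1 + m_2\cos\chi$ is negative, which can happen for $m_1 < m_2$.

```python
import math

def lab_angles(m1, m2, chi):
    """Return (theta, theta_prime) in the laboratory frame, eqs. (6.10)-(6.11)."""
    theta = math.atan2(m2 * math.sin(chi), m1 + m2 * math.cos(chi))
    theta_prime = 0.5 * (math.pi - chi)
    return theta, theta_prime
```

For equal masses the function reproduces (6.12): the first angle is half of $\chi$, and the two angles add up to a right angle.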

We shall now determine the kinetic energy transferred to the
second particle in the collision, starting with the case of different
masses. From Eqs. (6.9) we find

$$E_2 = \frac{m_2}{2}\left(v_{2x}^2 + v_{2y}^2\right) = \frac{m_1^2 m_2 (1 - \cos\chi)\,v_0^2}{(m_1 + m_2)^2}$$

Relative to the energy $E_0$ of the incident particle, this is

$$\frac{E_2}{E_0} = \frac{2 m_1 m_2 (1 - \cos\chi)}{(m_1 + m_2)^2}$$

For particles of equal mass we obtain

$$\frac{E_2}{E_0} = \sin^2\frac{\chi}{2} = \sin^2\theta$$

Accordingly, for the first particle there remains

$$\frac{E_1}{E_0} = \cos^2\theta$$

In a head-on collision, $\chi = 180°$, $\theta = 90°$, $E_1 = 0$, and $E_2 = E_0$,
as was proved.
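The fraction of energy handed over to the target follows from (6.9) with $E_0 = m_1 v_0^2/2$ taken as the energy of the incident particle. The one-line function below is a sketch with a hypothetical name.

```python
import math

def energy_fraction_transferred(m1, m2, chi):
    """E_2 / E_0 for an elastic collision with a target initially at rest."""
    return 2.0 * m1 * m2 * (1.0 - math.cos(chi)) / (m1 + m2) ** 2
```

For equal masses and a head-on collision the fraction is unity; for a target three times heavier than the projectile it drops to 0.75 in the head-on case.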


The Particle Scattering Problem. We shall now consider the col¬ 
lision problem in greater detail, confining ourselves to the case of 
elastic collision, which is more conveniently investigated in the 
centre-of-mass frame of reference. The transformation to the labo¬ 
ratory frame according to Eqs. (6.7) is straightforward. 

Obviously, for a complete solution of the collision problem we 
must know the potential energy U(r) of the particles’ interaction 
and state the initial conditions in a way that would make it possible 
to determine all the integrals of the motion. The energy integral is 
easily found if we recall that at infinity U is gauged to zero: U( oo) = 
= 0. Denoting the relative velocity of particles at an infinite distance 
from each other by v, we obtain the value of the energy integral: 


E 


mu 2 


(6.14) 


2 
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Now we determine the angular momentum integral. Figure 6 
presents the motion of the first particle relative to the second for the 
case of repulsive forces. Its path at infinity from the second particle 
is a straight line, because the forces vanish there. Hence, the path 
of the particle has asymptotes both along the approach section 
(line AF) and along the receding section (line FB). The distance p of 
the asymptote AF from line OC drawn parallel to it from the second 



particle is known as the impact, or collision , parameter, it is nothing 
other than the “arm” when the particles are infinitely apart. It is 
apparent from this that the angular momentum is 

M = mv p (6.15) 

The mass in Eqs. (6.14) and (6.15) is the reduced mass. 

Knowing the energy and angular momentum integrals we can
compute the deflection angle $\chi$ of the first particle. From Figure 6
it can be seen that this angle is connected with the angle $2\varphi_0$ between
the asymptotes by the simple relationship $\chi = \pi - 2\varphi_0$. In turn,
$\varphi_0$ is calculated with the help of the quadrature according to
Eq. (5.10), where the upper limit is taken equal to infinity:

$$\varphi_0 = \int_{r_0}^{\infty} \frac{\rho v \, dr}{r^2 \left[ v^2 - v^2\rho^2/r^2 - 2U(r)/m \right]^{1/2}} \tag{6.16}$$

The angular momentum and energy integrals are already substituted
here using (6.14) and (6.15). The lower limit $r_0$ is calculated from
Eq. (5.11).
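
As a numerical cross-check, the quadrature (6.16) can be evaluated for any repulsive potential. The sketch below is an illustration, not part of the original derivation. It assumes the substitution u = 1/r, which turns (6.16) into an integral of ρ du / [1 − ρ²u² − 2U/(mv²)]^{1/2} from 0 to u_max = 1/r_0, and a second substitution u = u_max(1 − s²) that removes the inverse-square-root singularity at the turning point. The function name and the test potential are hypothetical choices.

```python
import math

def deflection_angle(U, m, v, rho, steps=4000):
    """Deflection angle chi = pi - 2*phi_0 for a repulsive potential U(r).

    Assumes U(r) >= 0 and monotonically decreasing, so that the radicand
    g(u) = 1 - rho^2 u^2 - 2 U(1/u) / (m v^2),  u = 1/r,
    falls monotonically from 1 and has a single zero u_max = 1/r_0.
    """
    def g(u):
        return 1.0 - (rho * u) ** 2 - 2.0 * U(1.0 / u) / (m * v * v)

    # Bisection for the turning point; lo always stays where g > 0.
    lo, hi = 0.0, 1.0 / rho          # g(1/rho) = -2U/(m v^2) <= 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    u_max = lo

    # Midpoint rule in s, with u = u_max * (1 - s^2), |du| = 2 u_max s ds.
    h = 1.0 / steps
    phi0 = 0.0
    for k in range(steps):
        s = (k + 0.5) * h
        u = u_max * (1.0 - s * s)
        phi0 += rho * 2.0 * u_max * s / math.sqrt(g(u)) * h
    return math.pi - 2.0 * phi0

# Coulomb repulsion U = alpha / r, compared with the closed form that
# follows from Eq. (6.20): cot(chi/2) = m v^2 rho / alpha.
alpha, m, v, rho = 1.0, 1.0, 2.0, 0.5
chi_numeric = deflection_angle(lambda r: alpha / r, m, v, rho)
chi_exact = 2.0 * math.atan(alpha / (m * v * v * rho))
```

The midpoint rule never samples the end points, so neither the turning point nor r = ∞ is evaluated directly.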

The Differential Cross Section. Suppose the integral in (6.16) can
be found. Then $\varphi_0$, and hence the deflection angle $\chi$, are known as
functions of the impact parameter. Let this function be inverted,
that is, the impact parameter is obtained as a function of the
deflection angle:

$$\rho = \rho(\chi) \tag{6.17}$$

In collision experiments the impact parameter is actually never
known in advance: a parallel beam of scattering particles having
the same velocity is directed on a substance whose atoms or nuclei
are the scatterers. The distribution of particles according to their
deflection angles $\chi$ (more precisely, according to the deflection
angles $\theta$ in the laboratory system) is observed. Thus, the scattering
events are, so to say, observed at many successive times, with the
most diverse impact parameters.

Let one particle pass through one square centimetre of the
scattering substance. Then $2\pi\rho\,d\rho$ particles pass through an annulus
contained between $\rho$ and $\rho + d\rho$. We thus classify the collisions
according to impact parameters much as is done on a shooting range
with the help of targets with a pattern of concentric rings. If $\rho$ is
known as a function of $\chi$, it can be asserted that
$d\sigma = 2\pi\rho\,d\rho = 2\pi\rho\,(d\rho/d\chi)\,d\chi$ particles will be deflected at an
angle lying between $\chi$ and $\chi + d\chi$. The dependence of $\rho$ on $\chi$ is
determined from Eq. (6.17).

Suppose the scattered particles are in some way detected at a
large distance from the scattering medium. The latter can then be
treated as a point and the scattered particles assumed to be
travelling along rectilinear paths radiating from a common centre. Let us
examine the particles moving in the space between two cones sharing
a common apex and common axis. The direction of the axis
coincides with the direction of the incident particles. The half-angle at
the apex of the inner cone is equal to $\chi$, and the half-angle of the
outer cone is $\chi + d\chi$ (Figure 7).

The space between the two cones is called a solid angle, by analogy
with a plane angle, which is defined as the part of a plane between
two straight lines. The measure of a plane angle is the arc of a circle
of unit radius with its centre at the apex of the angle; the measure
of a solid angle is the area of the sector of a sphere of unit radius
drawn from the apex and contained within the cone.

In Figure 7 the solid-angle element is represented as part of the
surface of a sphere swept out by the arc element $d\chi$ in a rotation
about the radius OC. Since OC = 1, the radius of rotation of the
element $d\chi$ is equal to $\sin\chi$, whence the area of the sphere swept
out by it is $2\pi \sin\chi \, d\chi$. Thus, the solid-angle element is

$$d\Omega = 2\pi \sin\chi \, d\chi \tag{6.18}$$


In passing from the angle element $d\chi$ to the solid-angle
element $d\Omega$ we can write the expression for the number of particles
scattered within the solid-angle element:

$$d\sigma = \rho \, \frac{d\rho}{d\chi} \, \frac{d\Omega}{\sin\chi} \tag{6.19}$$

The quantity $d\sigma$ has the dimensions of area. It is the area within
which a particle must fall to be scattered within the elementary
solid-angle element $d\Omega$. It is called the differential cross section of
scattering into the solid-angle element $d\Omega$.

In experiments it is this quantity that is determined in recording
the particles deflected at different angles. If a unit volume of the
scattering medium contains $n$ scatterers, the attenuation of the
primary parallel beam in passing through unit thickness of the
medium due to deflection into the solid-angle element $d\Omega$ is

$$dJ = -J n \, d\sigma = -J n \rho \, \frac{d\rho}{d\chi} \, \frac{d\Omega}{\sin\chi} \quad \text{particles/(cm}^2 \cdot \text{s)}$$

In studying the dependence of $d\sigma$ on $\chi$ we find how the impact
parameter depends on the deflection angle, which makes it possible
to draw conclusions about the nature of the forces acting between
the particle and the scattering centre.


Rutherford’s Formula. The most important application of
Eq. (6.19) is in particle scattering in a Coulomb field. Assuming
the scattering medium to have very large mass in comparison with
the scattered charged particle, the laboratory frame of reference
differs but slightly from the centre-of-mass frame, and the reduced
mass closely approximates the mass of the light participant in the
collision (see (3.21)). As was pointed out in Section 3, the Coulomb
potential decreases with distance according to the law $1/r$, like the
Newtonian gravity potential. Consequently, the deflection angle
can be computed from Eqs. (5.12b) and (5.12c), depending on whether
the forces acting between the particles are of attraction or repulsion.
Let the charge of the scatterer be $Ze$, and that of the scattered
particle $\pm e$. Then the force constant $\alpha$ is equal to $\pm Ze^2$. The
colliding particles are at infinite distance when the denominators in
Eqs. (5.12b) and (5.12c) vanish, that is, when

$$\cos\varphi_0 = \pm\left(1 + \frac{2M^2E}{m\alpha^2}\right)^{-1/2}, \quad \text{or} \quad \tan\varphi_0 = \pm\left(\frac{2M^2E}{m\alpha^2}\right)^{1/2} \tag{6.20}$$

Since only the absolute value of the deflection angle matters in
determining the differential cross section, we take one sign for the
tangent. Different signs of $\alpha$ in Figure 6 would correspond to
deflections of the particle up or down from the page, which in this
case is irrelevant.

Determining the integrals of the motion according to Eqs. (6.14)
and (6.15), and recalling that $\chi = \pi - 2\varphi_0$, we find the impact
parameter as a function of the deflection angle:

$$\rho = \frac{Ze^2}{mv^2} \cot\frac{\chi}{2}$$

Since in the present case the centre-of-mass frame of reference
is closely approximated by the laboratory frame, $\chi$ can be replaced
by $\theta$, that is, the particle’s deflection angle in the laboratory frame.
We now write the formula of the differential cross section in the
laboratory frame according to the general definition (6.19):

$$d\sigma = \frac{Z^2 e^4 \, d\Omega}{4 m^2 v^4 \sin^4(\theta/2)} \tag{6.21}$$

The number of particles scattered into the solid-angle element
$d\Omega = 2\pi \sin\theta \, d\theta$ is inversely proportional to the fourth power of
the sine of one-half the deflection angle. This law is uniquely
connected with the Coulomb character of the scattering forces. The
greater the deflection angle, the smaller the impact parameter.
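
The intermediate steps between (6.20) and (6.21) can be written out explicitly. The following is a worked sketch using only Eqs. (6.14), (6.15), (6.19) and (6.20); the absolute value of the derivative is taken because the cross section is an area, which is an assumption made explicit here rather than a statement from the text.

```latex
% Insert M = m v \rho and E = m v^2 / 2 into (6.20), with |\alpha| = Z e^2:
\tan\varphi_0 = \left(\frac{2M^2E}{m\alpha^2}\right)^{1/2} = \frac{m v^2 \rho}{Z e^2},
\qquad
\varphi_0 = \frac{\pi}{2} - \frac{\chi}{2}
\;\Longrightarrow\;
\rho = \frac{Z e^2}{m v^2}\cot\frac{\chi}{2}.

% Differentiate and substitute into (6.19):
\left|\frac{d\rho}{d\chi}\right| = \frac{Z e^2}{2 m v^2}\,\frac{1}{\sin^2(\chi/2)},
\qquad
\sin\chi = 2\sin\frac{\chi}{2}\cos\frac{\chi}{2},

d\sigma
= \rho\left|\frac{d\rho}{d\chi}\right|\frac{d\Omega}{\sin\chi}
= \left(\frac{Z e^2}{m v^2}\right)^{2}
  \frac{\cos(\chi/2)}{\sin(\chi/2)}\cdot
  \frac{1}{2\sin^2(\chi/2)}\cdot
  \frac{d\Omega}{2\sin(\chi/2)\cos(\chi/2)}
= \frac{Z^2 e^4\, d\Omega}{4 m^2 v^4 \sin^4(\chi/2)} .
```

Replacing χ by θ for a heavy scatterer reproduces Eq. (6.21).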

E. Rutherford traced the law (6.21) for the case of scattering of
alpha particles on heavy nuclei up to very small impact parameters.
At impact parameters of the order of $8 \times 10^{-13}$ cm, scattering becomes
subject to a different law. Rutherford concluded from this that the
whole positive charge of an atom is concentrated at the centre,
since the diameter of an atom is around $10^{-8}$ cm. Practically the
whole mass of the atom is also concentrated at the centre, otherwise
alpha particles could not be scattered at large angles (sometimes
their initial directions are almost reversed). Thus Rutherford’s
experiments with alpha scattering led to the discovery of the atomic
nucleus.

If the colliding particles are of nearly the same mass, in Eq. (6.21)
the deflection angle $\chi$ in the centre-of-mass reference frame should
be substituted for the angle $\theta$, and the reduced mass substituted for
the mass of the lighter participant, after which one can go over to the
laboratory reference frame according to (6.10) and (6.11).

The Rutherford formula undergoes a curious change in
collisions of two identical particles, for example, in the scattering of
an alpha particle on a helium nucleus (the particles are, as is known,
the same). There is no method capable of distinguishing one alpha
particle from another. It is impossible to determine whether a
detected particle was at rest prior to the collision or impinged on a
resting particle. Therefore, to obtain a formula suitable for
comparison with experiment it is necessary to take all the particles into
account in the expression for $d\sigma$.

It follows from Eqs. (6.12) that to transfer from the centre-of-mass
frame to the laboratory frame in collisions of identical particles,
$\chi$ must be replaced by $2\theta$. Then $\sin\chi \, d\chi = 2 \sin 2\theta \, d\theta$. It should
also be taken into account that in the laboratory frame the particles
scatter at right angles, so that $\sin\theta' = \cos\theta$. Substituting the
reduced mass $m^2/(2m) = m/2$ for the actual mass $m$, we obtain the
differential cross section for two identical particles interacting
according to the Coulomb law:

$$d\sigma = \left(\frac{2\alpha}{m v^2}\right)^2 \left(\frac{1}{\sin^4\theta} + \frac{1}{\cos^4\theta}\right) \cos\theta \, d\Omega \tag{6.22}$$

where $\alpha$ is the force constant of the Coulomb interaction ($\alpha^2$ takes
the place of $Z^2 e^4$ in Eq. (6.21)).

Isotropic Scattering. From Eq. (6.21) it is apparent that scattering
has a pronounced maximum for small deflection angles. This
maximum is associated with large impact parameters: particles passing
each other at great distances deflect weakly, and great impact
parameters predominate in the expression for the differential cross
section, since they involve a greater area. Therefore, if the forces of
interaction between the particles do not identically vanish at finite
distances, any particle, however far from the scatterer it may pass,
is somewhat deflected. The ratio $d\sigma/d\Omega$ then inevitably tends to
infinity at small deflection angles.

But if at large distances the force is not exactly zero but closely 
approaches it, that is, decreases rapidly, then the differential cross 
section begins to increase appreciably, tending to infinity only at 
the smallest deflection angles. But in experiments weakly deflected 
particles are not detected at all as having been deflected. Indeed, 
the initial beam already possesses a certain scattering, and deflection 
angles lying within this scattering angle cannot be detected. 

If the force decreases very rapidly with distance, the domain of
the sharply increasing $d\sigma/d\Omega$ as a function of angle $\chi$ may fall within
such small angles that they cannot be experimentally detected as
deflection angles, that is, separated from the initial beam. On the
other hand, all strongly deflected particles are the more uniformly
distributed as regards scattering angles the faster the forces diminish
with distance.
This is shown in the example of particle scattering by an
impermeable sphere (Exercise 1). The acting force can be treated as the
limiting case of a force centre repulsing particles according to the
law $U(r) = U_0 (r_0/r)^n$ at $n$ tending to infinity. If $r < r_0$, then
$U(r) \to \infty$; and if $r > r_0$, then $U(r) \to 0$. In other words, at $n \to \infty$
particles cannot penetrate the domain where $r < r_0$, which
corresponds to an impermeable sphere. But as can be seen from Exercise 1,
at $n = \infty$ the scattering is completely isotropic. If $n$ is not infinite
but sufficiently large, the particles are distributed almost
isotropically over all angles, and a sharp maximum appears only at small
deflection angles, growing without bound as the deflection angle
tends to zero. Consequently, a scattering law approximating the
isotropic law is indicative of a rapid diminishing of the forces with
distance. This circumstance played an important part in the
investigation of nuclear forces.


EXERCISES 

1. Find the differential cross section for particles scattering on an
impermeable sphere of radius $r_0$.

Solution. An impermeable sphere can be described in terms of mechanics
by stating the potential energy in the form $U(r) = 0$ at $r > r_0$ (outside the
sphere) and $U(r) = \infty$ at $r < r_0$ (inside the sphere). Then whatever a
particle’s kinetic energy, its penetration into the domain $r < r_0$ is impossible.

Reflection from the sphere takes place in the following way. The radial
momentum component reverses its sign, while the tangential component is
conserved, since, given radial symmetry of the potential, no forces can be
perpendicular to the radius. The absolute value of the linear momentum is
conserved, insofar as the impact is elastic and the kinetic energy does not
change. A simple construction reveals that the impact parameter is related
to the deflection angle by the dependence $\rho = r_0 \cos(\chi/2)$ if $\rho < r_0$.
Hence the general formula yields

$$d\sigma = \frac{r_0^2}{4} \, d\Omega$$

so that the scattering is uniform over all angles, that is, it is isotropic. The
total scattering cross section $\sigma$ is equal to $\pi r_0^2$ in this case, as could be
expected. Significant here is the fact that the interaction forces identically vanish
at a finite distance.
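
The result of Exercise 1 can be checked numerically. The sketch below is illustrative and not part of the original solution: it evaluates Eq. (6.19) with a central finite difference for dρ/dχ, taking the absolute value of the derivative (an assumption, made because ρ decreases as χ grows while the cross section is an area), and then integrates over all solid angles. The function names and the value of r0 are hypothetical.

```python
import math

def dsigma_domega(rho_of_chi, chi, h=1e-5):
    """d(sigma)/d(Omega) = rho * |d rho / d chi| / sin(chi), from Eq. (6.19).

    The derivative is approximated by a central finite difference of step h.
    """
    drho = (rho_of_chi(chi + h) - rho_of_chi(chi - h)) / (2.0 * h)
    return rho_of_chi(chi) * abs(drho) / math.sin(chi)

def total_cross_section(rho_of_chi, steps=2000):
    """Midpoint-rule integral of d(sigma) with d(Omega) = 2 pi sin(chi) d(chi)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        chi = (k + 0.5) * h
        total += dsigma_domega(rho_of_chi, chi) * 2.0 * math.pi * math.sin(chi) * h
    return total

# Impermeable sphere: rho = r0 * cos(chi / 2).
r0 = 2.0

def hard_sphere(chi):
    return r0 * math.cos(chi / 2.0)

isotropic_value = r0 ** 2 / 4.0            # expected d(sigma)/d(Omega)
geometric_area = math.pi * r0 ** 2         # expected total cross section
sigma_total = total_cross_section(hard_sphere)
```

Any other monotonic ρ(χ), for instance the Coulomb relation, can be passed in place of `hard_sphere`.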

2. A collision of two particles is observed, $m_1$ and $m_2$ being their
masses ($m_1$ is the mass of the incident particle). As a result of the collision
particles are formed whose momenta lie at angles $\varphi$ and $\psi$ to the
momentum of the incident particle. Determine the energy $Q$ by which the
total kinetic energy of the colliding particles changes. Consider two cases:
the collision yields (a) two particles of the same masses $m_1$ and $m_2$ and (b)
two particles of different masses $m_3$ and $m_4$, in sum equal to $m_1 + m_2$.
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SMALL OSCILLATIONS 

In Section 4 we considered the oscillations of a pendulum. It was 
pointed out that they obey a complex law, which cannot be described 
with the help of elementary functions. The situation is simplified, 
however, in the limiting case of very small deflection angles of the 
pendulum from the vertical. This type of motion, known as small 
oscillations , is very important in applications of mechanics and, up 
to various approximations, is widely encountered in nature and in 
technology. For this reason the theory of small oscillations is treated 
here as a special section. 

Small Oscillations of a Pendulum. A simple graphic investigation
reveals that in the most general case the motion of a pendulum is
periodic. Figure 8 shows the curve $U(\varphi) = mgl(1 - \cos\varphi)$, which
gives the relationship between potential energy and deflection angle.
The horizontal straight line corresponds to a constant value of $E$.
If $E < 2mgl$, the motion occurs periodically with time between the
points $-\varphi_0$ and $\varphi_0$.

The problem is greatly simplified if $\varphi_0 \ll 1$, that is, the angle is
small in comparison with one radian. Then $\cos\varphi_0$ can be replaced
by a Taylor expansion up to the second term: $\cos\varphi_0 = 1 - \varphi_0^2/2$.
Since $|\varphi| < \varphi_0$, the expansion is also valid for $\cos\varphi$. After this
the integral (4.7) can be easily evaluated:

$$t = \left(\frac{l}{g}\right)^{1/2} \int_{\varphi}^{\varphi_0} \frac{d\varphi}{(\varphi_0^2 - \varphi^2)^{1/2}} = \left(\frac{l}{g}\right)^{1/2} \arccos\frac{\varphi}{\varphi_0} \tag{7.1}$$

Inverting relation (7.1), we get the deflection angle as a function of
time:

$$\varphi = \varphi_0 \cos\left[ t \, (g/l)^{1/2} \right] \tag{7.2}$$

The result is a periodic function. As can be seen from (7.2), the
deflection angle reverts to its initial value in time $\tau = 2\pi (l/g)^{1/2}$,
which is known as the period of oscillation. The quantity $(g/l)^{1/2}$ is
called the frequency of oscillation:

$$\omega = (g/l)^{1/2} \tag{7.3}$$

This quantity gives the number of radians by which the argument
of the cosine in (7.2) changes in one second. The term frequency is
also used to denote the number of periods per second; in that case
it is smaller than $\omega$ by a factor of $2\pi$. The oscillation period $\tau$ is
connected with $\omega$ by the relationship $\tau = 2\pi/\omega$. It is significant that
the period and frequency of small oscillations do not depend on
their amplitude $\varphi_0$.
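
How small the amplitude must be for the period to be amplitude-independent can be estimated numerically. The sketch below is an illustration under a stated assumption: it uses the standard closed form for the exact pendulum period, τ_exact = 2π(l/g)^{1/2} / AGM(1, cos(φ_0/2)), where AGM is the arithmetic-geometric mean. That closed form is not derived in this text. The function names are hypothetical.

```python
import math

def agm(a, b, tol=1e-15):
    """Arithmetic-geometric mean of two positive numbers."""
    while abs(a - b) > tol * a:
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a

def period_ratio(phi0):
    """Exact pendulum period divided by the small-oscillation period
    2*pi*sqrt(l/g), for an amplitude phi0 in radians (0 <= phi0 < pi)."""
    return 1.0 / agm(1.0, math.cos(phi0 / 2.0))

# Relative lengthening of the period for a few amplitudes (radians).
for phi0 in (0.05, 0.2, 0.5, 1.0):
    print(f"phi0 = {phi0:4.2f}  exact/small-angle period = {period_ratio(phi0):.6f}")
```

For small amplitudes the ratio behaves as 1 + φ_0²/16, so the correction to (7.3) is of second order in the amplitude.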

The General Problem of Small Oscillations with One Degree of
Freedom. In solving the problem of small oscillations there is no
need to first reduce to quadrature the problem of arbitrary
oscillations. We can initially simplify the Lagrangian.

First, we note that all oscillations, large and small, always occur
around a position of equilibrium (for instance, a pendulum oscillates
about its vertical position). When deflected from stable equilibrium
a system is subject to a force which acts in the direction opposite to
the deflection, whatever its sign. This is known as the restoring force.
At the equilibrium point the force is obviously zero, simply by
definition of the concept of equilibrium.

Force is equal to the derivative of potential energy taken with
the opposite sign. We shall take the derivative with respect to the
generalized coordinate. Then the equilibrium condition has the form

$$\frac{dU}{dq} = 0 \tag{7.4}$$

Let us denote the solution of this equation $q = q_0$. Assuming the
system to have only one degree of freedom, we expand $U(q)$ in
a Taylor series in the vicinity of point $q_0$ up to the quadratic term
to get

$$U(q) = U(q_0) + \left(\frac{dU}{dq}\right)_{q=q_0} (q - q_0) + \frac{1}{2} \left(\frac{d^2U}{dq^2}\right)_{q=q_0} (q - q_0)^2 + \dots \tag{7.5}$$

In accordance with (7.4), the term linear in $(q - q_0)$ vanishes.
We denote $(d^2U/dq^2)_{q=q_0}$ by $\beta$. Then, confining ourselves to the
indicated terms of the series, we obtain

$$U(q) \approx U(q_0) + \frac{\beta}{2} (q - q_0)^2 \tag{7.6}$$

The force near the equilibrium position is

$$F(q) = -\frac{dU}{dq} = -\beta (q - q_0) \tag{7.7}$$

For this force to be a restoring force, that is, for the equilibrium
to be stable, the following inequality must hold:

$$\beta > 0 \tag{7.8}$$

This is the stability condition for the equilibrium: the function
$U(q)$ must increase on both sides of the point $q = q_0$. It follows
that the potential energy at that point must have a minimum. This
is shown in Figure 8 at $\varphi = 0$.

Let us now examine the expression for kinetic energy. If in the
general kinetic energy formula we substitute $x = x(q)$, $y = y(q)$,
and $z = z(q)$, then $T$ reduces to the form

$$T = \frac{m}{2} \left[ \left(\frac{dx}{dq}\right)^2 + \left(\frac{dy}{dq}\right)^2 + \left(\frac{dz}{dq}\right)^2 \right] \dot q^2$$

The quantity in the brackets depends only on $q$; and so the kinetic
energy of a particle can be represented in the form

$$T = \frac{1}{2} \alpha(q) \, \dot q^2 \tag{7.9}$$

Let us now expand the function $\alpha(q)$ in a series of $q - q_0$ in the
vicinity of the equilibrium position:

$$T = \frac{1}{2} \alpha(q_0) \, \dot q^2 + \frac{1}{2} \left(\frac{d\alpha}{dq}\right)_{q=q_0} (q - q_0) \, \dot q^2 + \dots$$

In order that the particle should not move far from the equilibrium
position, its velocity must be small. In other words, the member
$\alpha(q_0) \dot q^2/2$ of the kinetic energy expansion is already of the same
order for small oscillations as the third term in the expansion of $U$,
that is, $\beta (q - q_0)^2/2$. When $q = q_0$, all the energy of oscillation
is kinetic, while at maximum deflection all the energy is potential.
But since the total energy is the same at all points, we must assume
that the potential and kinetic energies are of the same order of
magnitude; this, as always in evaluations, refers to the leading terms
in the expansion of $T$ and $U$. Consequently, in the first
approximation the subsequent terms can be neglected, if they are not of some
special interest. Later (after the explicit dependence of $q$ on time is
determined from the equation) it will be shown that the mean values
of $U$ and $T$ are equal.

In future, the coordinate $q$ will be measured from the equilibrium
position, that is, we shall put $q_0 = 0$. Then, omitting $U(0)$ from the
Lagrangian, we can write

$$L = \frac{1}{2} \alpha(0) \, \dot q^2 - \frac{1}{2} \beta q^2 \tag{7.10}$$

From this we obtain the Lagrange equation

$$\alpha(0) \, \ddot q + \beta q = 0 \tag{7.11}$$

Denoting

$$\omega^2 = \frac{\beta}{\alpha(0)} = \frac{1}{\alpha(0)} \left(\frac{d^2U}{dq^2}\right)_{q=q_0} \tag{7.12}$$

we reduce (7.11) to the usual form for the oscillation equation:

$$\ddot q + \omega^2 q = 0 \tag{7.13}$$

The general solution of this equation must contain two arbitrary
constants. It may be written in one of three forms:

$$q = C_1 \cos\omega t + C_2 \sin\omega t \tag{7.14a}$$

$$q = C \cos(\omega t + \gamma) \tag{7.14b}$$

$$q = \mathrm{Re}\left(C' e^{i\omega t}\right) \tag{7.14c}$$

The symbol Re( ) signifies the real part of the expression inside the
parentheses. The constant $C'$ is complex: $C' = C_1 - iC_2$. The
constants $C$ and $\gamma$ in solution (7.14b) are known as the amplitude and
the initial phase of the oscillation. They are connected with $C_1$ and
$C_2$ by the known formulas

$$C = (C_1^2 + C_2^2)^{1/2}, \qquad \gamma = -\arctan\frac{C_2}{C_1}$$

If we are only interested in the frequency of small oscillations and
not the phase or amplitude, it is sufficient to use Eq. (7.12), verifying
that the second derivative $(d^2U/dq^2)_{q=q_0}$ is positive.

A system which is described by Eq. (7.13) is called a linear
harmonic oscillator.
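
The recipe of Eq. (7.12) can be tried out numerically. The sketch below is only an illustration: it approximates the second derivative of U by a central finite difference and assumes that the caller supplies an equilibrium point q0 at which dU/dq = 0. The function name, the step h and the numerical values are hypothetical.

```python
import math

def small_oscillation_frequency(U, alpha, q0, h=1e-4):
    """Frequency from Eq. (7.12): omega^2 = U''(q0) / alpha(q0).

    U      -- potential energy as a function of the generalized coordinate
    alpha  -- coefficient in the kinetic energy T = alpha(q) * qdot**2 / 2
    q0     -- equilibrium position, assumed to satisfy dU/dq = 0
    """
    # Central finite difference for beta = d^2 U / d q^2 at q0.
    beta = (U(q0 + h) - 2.0 * U(q0) + U(q0 - h)) / (h * h)
    if beta <= 0.0:
        # Condition (7.8) is violated: no restoring force.
        raise ValueError("equilibrium is not stable: U''(q0) <= 0")
    return math.sqrt(beta / alpha(q0))

# Pendulum: U = m g l (1 - cos(phi)), T = m l^2 phidot^2 / 2.
m, g, l = 0.5, 9.81, 2.0

def pendulum_U(phi):
    return m * g * l * (1.0 - math.cos(phi))

def pendulum_alpha(phi):
    return m * l * l

omega = small_oscillation_frequency(pendulum_U, pendulum_alpha, q0=0.0)
```

At the upper equilibrium, q0 = π, the same call raises an error, because there the second derivative of U is negative.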

It can be seen from Eqs. (7.10), (7.12), (7.14b) that the averages
of the potential energy and kinetic energy during one period are
the same, because the averages of the squares of a sine or cosine are
equal to one-half:

$$\overline{\sin^2(\omega t + \gamma)} = \frac{\omega}{2\pi} \int_0^{2\pi/\omega} \sin^2(\omega t + \gamma) \, dt = \frac{1}{2}, \qquad \overline{\cos^2(\omega t + \gamma)} = \frac{1}{2}$$

$$\bar T = \bar U = \frac{1}{4} \alpha(0) \, \omega^2 C^2 = \frac{1}{4} \beta C^2$$

Small Oscillations with Two or More Degrees of Freedom. We shall
now consider oscillations with two degrees of freedom. As an
example, let us first take the double pendulum (Sec. 3). If we confine
ourselves to small oscillations, we must consider that the deflections $\varphi$
and $\psi$ are close to zero (that is, the pendulum oscillates with small
deflections close to the vertical). In that case we must substitute the
equilibrium values, $\varphi = \psi = 0$ and $\cos(\varphi - \psi) = 1$, into the
expression for the kinetic energy in (3.23). The potential energy
formula must be simplified in the same way as in the simple
pendulum problem, that is, we must replace $\cos\varphi$ and $\cos\psi$ by $1 - \varphi^2/2$
and $1 - \psi^2/2$. Eliminating the constant terms, we obtain the
Lagrangian in the form

$$L = \frac{m + m_1}{2} l^2 \dot\varphi^2 + \frac{m_1}{2} l_1^2 \dot\psi^2 + m_1 l l_1 \dot\varphi \dot\psi - \frac{m + m_1}{2} l g \varphi^2 - \frac{m_1}{2} l_1 g \psi^2 \tag{7.15}$$

Let us examine this in a somewhat more general form:

$$L = \frac{1}{2} \left( \alpha_{11} \dot q_1^2 + 2\alpha_{12} \dot q_1 \dot q_2 + \alpha_{22} \dot q_2^2 \right) - \frac{1}{2} \left( \beta_{11} q_1^2 + 2\beta_{12} q_1 q_2 + \beta_{22} q_2^2 \right) - U(0) \tag{7.16a}$$

Using the summation convention and omitting the
constant term, we write

$$L = \frac{1}{2} \alpha_{\mu\nu} \dot q_\mu \dot q_\nu - \frac{1}{2} \beta_{\mu\nu} q_\mu q_\nu \tag{7.16b}$$

In this form it refers to a system with an arbitrary number of
vibrational degrees of freedom.

Comparing (7.15) and (7.16a), we obtain the values of the factors
$\alpha_{\mu\nu}$ and $\beta_{\mu\nu}$ for the case of a double pendulum:

$$\alpha_{11} = (m + m_1) l^2, \qquad \alpha_{12} = m_1 l l_1, \qquad \alpha_{22} = m_1 l_1^2$$

$$\beta_{11} = (m + m_1) l g, \qquad \beta_{12} = 0, \qquad \beta_{22} = m_1 l_1 g$$

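The fact that $\beta_{12} = 0$ introduces no simplifications into the problem.
In the most general case the coefficients $\beta_{11}$, $\beta_{12}$, and $\beta_{22}$ are
expressed by the equations

$$\beta_{11} = \left(\frac{\partial^2 U}{\partial q_1^2}\right)_0, \qquad \beta_{12} = \left(\frac{\partial^2 U}{\partial q_1 \partial q_2}\right)_0, \qquad \beta_{22} = \left(\frac{\partial^2 U}{\partial q_2^2}\right)_0$$

where the equilibrium values are substituted into the derivatives;
the former must be determined from equations similar to (7.4):

$$\frac{\partial U}{\partial q_1} = \frac{\partial U}{\partial q_2} = 0, \quad \text{or} \quad \frac{\partial U}{\partial q_\nu} = 0 \tag{7.17}$$

For the equilibrium to be stable we must require the following
inequality to be satisfied:

$$U(q) - U(0) = \frac{1}{2} \left( \beta_{11} q_1^2 + 2\beta_{12} q_1 q_2 + \beta_{22} q_2^2 \right) > 0$$

Given this condition, the minimum of $U(q)$ lies at point $q_1 = 0$,
$q_2 = 0$.

Let us rewrite the left-hand side of the inequality in an identically
equal form:

$$\frac{1}{2} \left( \beta_{11} q_1^2 + 2\beta_{12} q_1 q_2 + \beta_{22} q_2^2 \right) = \frac{\beta_{11}}{2} \left( q_1 + \frac{\beta_{12}}{\beta_{11}} q_2 \right)^2 + \frac{\beta_{11}\beta_{22} - \beta_{12}^2}{2\beta_{11}} q_2^2$$

This expression remains positive for all values of $q_1$ and $q_2$, provided
the coefficients of both squares are greater than zero:

$$\beta_{11} > 0, \qquad \beta_{11}\beta_{22} - \beta_{12}^2 > 0 \tag{7.18}$$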

The obtained inequalities can be easily generalized for the case
of an arbitrary number of degrees of freedom. For this the coefficients
must be written in the form of a square array:

$$\beta_{\mu\nu} = \begin{array}{c|c|c} \beta_{11} & \beta_{12} & \beta_{13} \\ \cline{1-1} \beta_{21} & \beta_{22} & \beta_{23} \\ \cline{1-2} \beta_{31} & \beta_{32} & \beta_{33} \end{array} \tag{7.19}$$

Then by the method of induction it can be concluded that all the
determinants whose principal diagonal coincides with the principal
diagonal of (7.19) must be positive for the quadratic form $\beta_{\mu\nu} q_\mu q_\nu$
to remain greater than zero at all $q_\mu$, $q_\nu$ (the proof is presented in
any course in higher algebra). In (7.19) these determinants are lined
off below and on the right. As for the quadratic form for the kinetic
energy, the positive values are automatically assured by the initial
expression written in terms of Cartesian coordinates. In future we
shall always assume conditions (7.19) to be satisfied.
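
The criterion stated above (all determinants along the principal diagonal positive) is straightforward to test for a given set of coefficients. The sketch below is an illustration in plain Python; the function names are hypothetical, and the matrix is assumed to be symmetric, as the β coefficients are by construction.

```python
def determinant(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in a]   # work on a float copy
    n = len(a)
    det = 1.0
    for i in range(n):
        # Pick the row with the largest pivot in column i.
        p = max(range(i, n), key=lambda r: abs(a[r][i]))
        if a[p][i] == 0.0:
            return 0.0
        if p != i:
            a[i], a[p] = a[p], a[i]
            det = -det                          # a row swap flips the sign
        det *= a[i][i]
        for r in range(i + 1, n):
            f = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= f * a[i][c]
    return det

def is_stable(beta):
    """True if every leading principal minor of the symmetric matrix beta
    is positive, i.e. the quadratic form beta[m][n]*q[m]*q[n] is positive."""
    return all(
        determinant([row[:k] for row in beta[:k]]) > 0.0
        for k in range(1, len(beta) + 1)
    )

# Double pendulum with l = g = m = 1, m1/m = 3/4, l1/l = 5/7.
beta_pendulum = [[7 / 4, 0.0], [0.0, 15 / 28]]
stable = is_stable(beta_pendulum)
```

For two degrees of freedom the check reduces to the two inequalities (7.18).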
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We shall now write the Lagrange equations: 

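$$\frac{\partial L}{\partial \dot q_1} = \alpha_{11} \dot q_1 + \alpha_{12} \dot q_2, \qquad \frac{\partial L}{\partial \dot q_2} = \alpha_{12} \dot q_1 + \alpha_{22} \dot q_2$$

$$-\frac{\partial L}{\partial q_1} = \beta_{11} q_1 + \beta_{12} q_2, \qquad -\frac{\partial L}{\partial q_2} = \beta_{12} q_1 + \beta_{22} q_2$$

or, for any number of degrees of freedom,

$$\frac{\partial L}{\partial \dot q_\mu} = \alpha_{\mu\nu} \dot q_\nu, \qquad \frac{\partial L}{\partial q_\mu} = -\beta_{\mu\nu} q_\nu$$

Whence

$$\begin{aligned} \alpha_{11} \ddot q_1 + \alpha_{12} \ddot q_2 + \beta_{11} q_1 + \beta_{12} q_2 &= 0 \\ \alpha_{12} \ddot q_1 + \alpha_{22} \ddot q_2 + \beta_{12} q_1 + \beta_{22} q_2 &= 0 \end{aligned} \tag{7.20a}$$

or in general form (for any number of degrees of freedom)

$$\alpha_{\mu\nu} \ddot q_\nu + \beta_{\mu\nu} q_\nu = 0 \tag{7.20b}$$

In order to satisfy these equations, we shall look for a solution
in the form

$$q_1 = A_1 e^{i\omega t}, \qquad q_2 = A_2 e^{i\omega t}, \qquad q_\mu = A_\mu e^{i\omega t} \tag{7.21}$$

As in (7.14c), the real part of the solutions must be taken.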


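The Frequency Equation. Substituting (7.21) into (7.20), and
cancelling out $e^{i\omega t}$, we obtain the equations relating $A_1$ and $A_2$:

$$\begin{aligned} (\beta_{11} - \alpha_{11}\omega^2) A_1 + (\beta_{12} - \alpha_{12}\omega^2) A_2 &= 0 \\ (\beta_{12} - \alpha_{12}\omega^2) A_1 + (\beta_{22} - \alpha_{22}\omega^2) A_2 &= 0 \end{aligned} \tag{7.22a}$$

or in the general case

$$(\beta_{\mu\nu} - \omega^2 \alpha_{\mu\nu}) A_\nu = 0 \tag{7.22b}$$

For a set of linear homogeneous equations to have a solution other
than zero the system determinant must vanish:

$$\begin{vmatrix} \beta_{11} - \alpha_{11}\omega^2 & \beta_{12} - \alpha_{12}\omega^2 \\ \beta_{12} - \alpha_{12}\omega^2 & \beta_{22} - \alpha_{22}\omega^2 \end{vmatrix} = 0, \quad \text{or} \quad \left| \beta_{\mu\nu} - \omega^2 \alpha_{\mu\nu} \right| = 0 \tag{7.23}$$

From this, as applied to (7.22a), we obtain the biquadratic
equation

$$(\alpha_{11}\alpha_{22} - \alpha_{12}^2) \, \omega^4 - (\beta_{11}\alpha_{22} + \beta_{22}\alpha_{11} - 2\alpha_{12}\beta_{12}) \, \omega^2 + \beta_{11}\beta_{22} - \beta_{12}^2 = 0 \tag{7.24}$$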

For example, for a double pendulum Eq. (7.24) looks like this:

$$m m_1 l^2 l_1^2 \, \omega^4 - (m + m_1) m_1 l l_1 (l_1 + l) g \, \omega^2 + (m + m_1) m_1 l l_1 g^2 = 0$$

If we introduce the abbreviated notations $l_1/l = \lambda$, $m_1/m = \mu$
specifically for this problem, the expression for frequencies will be of
the following form:

$$\omega^2 = \frac{g}{2\lambda l} \left\{ (1 + \mu)(1 + \lambda) \pm \left[ (1 + \mu)^2 (1 + \lambda)^2 - 4\lambda (1 + \mu) \right]^{1/2} \right\}$$

It is easy to see that this expression yields only the real values of
the frequencies. However, we shall show this in more general form
for Eq. (7.24). Let us assume that the following function is given:

$$F(\omega^2) = (\alpha_{11}\alpha_{22} - \alpha_{12}^2) \, \omega^4 - (\beta_{11}\alpha_{22} + \beta_{22}\alpha_{11} - 2\beta_{12}\alpha_{12}) \, \omega^2 + \beta_{11}\beta_{22} - \beta_{12}^2$$

which passes through zero at all points where Eq. (7.24) is satisfied.
We see that $F(\omega^2)$ is positive for $\omega^2 = 0$ and for $\omega^2 = \infty$, since
$\alpha_{11}\alpha_{22} - \alpha_{12}^2 > 0$ and $\beta_{11}\beta_{22} - \beta_{12}^2 > 0$. Let us now substitute into
this function the positive number $\omega^2 = \beta_{11}/\alpha_{11}$. After a simple
rearrangement we obtain

$$\alpha_{11}^2 \, F\!\left(\frac{\beta_{11}}{\alpha_{11}}\right) = -(\alpha_{12}\beta_{11} - \beta_{12}\alpha_{11})^2 < 0$$

Thus, as $\omega^2$ varies from 0 to $\infty$, $F(\omega^2)$ is first positive, then negative,
and then again positive. Hence, it changes sign twice, so that
Eq. (7.24) has two positive roots $\omega_1^2$ and $\omega_2^2$, and, as was asserted,
all the values for frequency are real.

The quantity $\omega$ has four values, which form two pairs equal in
absolute value. If we represent the solution in the form (7.21), it is
sufficient to take only positive $\omega$.

The proof that all $\omega^2$ are in the most general case positive, provided
the quadratic form for $U(q)$ is positive, is analogous, though more
involved.
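
The frequency equation (7.24) is an ordinary quadratic in ω² and can be solved directly. The sketch below is an illustration; the function names are hypothetical, and the numerical case (mass ratio 3/4, length ratio 5/7, with l = g = m = 1) is the one used in the exercise at the end of this section.

```python
import math

def normal_frequencies_squared(a11, a12, a22, b11, b12, b22):
    """Both roots omega^2 of Eq. (7.24) for two degrees of freedom.

    a_ij are the kinetic-energy coefficients alpha, b_ij the
    potential-energy coefficients beta. Returns (larger, smaller).
    """
    A = a11 * a22 - a12 ** 2
    B = b11 * a22 + b22 * a11 - 2.0 * a12 * b12
    C = b11 * b22 - b12 ** 2
    disc = math.sqrt(B * B - 4.0 * A * C)   # real when both forms are positive
    return (B + disc) / (2.0 * A), (B - disc) / (2.0 * A)

def double_pendulum_coefficients(mu, lam, m=1.0, l=1.0, g=1.0):
    """alpha and beta of the double pendulum, with mu = m1/m and lam = l1/l."""
    m1, l1 = mu * m, lam * l
    alpha = ((m + m1) * l * l, m1 * l * l1, m1 * l1 * l1)
    beta = ((m + m1) * l * g, 0.0, m1 * l1 * g)
    return alpha + beta

w1_sq, w2_sq = normal_frequencies_squared(
    *double_pendulum_coefficients(mu=3 / 4, lam=5 / 7)
)
```

With these ratios the two roots come out as 7/2 and 7/10 in units of g/l, in agreement with the closed-form expression for the double pendulum above.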


Normal Coordinates. Substitute the roots of the frequency
equation into the set of equations (7.22). To each frequency there
corresponds a definite ratio of the required quantities $A_2/A_1$, or in the
general case, $A_\mu/A_1$. These ratios are equal to the ratios of the minors
of the elements in the first row of determinants (7.23). For the case
of two degrees of freedom the ratio $\xi_i = A_2^{(i)}/A_1^{(i)}$ is apparent directly
from Eq. (7.22):

$$\xi_i = -\frac{\beta_{11} - \omega_i^2 \alpha_{11}}{\beta_{12} - \omega_i^2 \alpha_{12}}, \qquad i = 1, 2 \tag{7.25}$$

Here the index $i$ corresponds to the number of the solution of the
frequency equation (7.23).

Each frequency $\omega_i$ defines one partial solution of the set (7.20a)
or (7.20b). The general solution of a linear system of equations is
represented as a sum of partial solutions:

$$q_1 = A_1^{(1)} e^{i\omega_1 t} + A_1^{(2)} e^{i\omega_2 t}, \qquad q_2 = \xi_1 A_1^{(1)} e^{i\omega_1 t} + \xi_2 A_1^{(2)} e^{i\omega_2 t}, \qquad q_\mu = \sum_k \xi_\mu^{(k)} A_1^{(k)} e^{i\omega_k t} \tag{7.26}$$

where $\xi_\mu^{(k)} = A_\mu^{(k)}/A_1^{(k)}$. We must, of course, take only the real parts
of these expressions.

We now introduce the following notation:

$$Q_k = A_1^{(k)} e^{i\omega_k t} \tag{7.27}$$

It is immediately apparent from this that $Q_1$, $Q_2$, and $Q_k$ satisfy
the differential equations

$$\ddot Q_1 + \omega_1^2 Q_1 = 0, \qquad \ddot Q_2 + \omega_2^2 Q_2 = 0, \qquad \ddot Q_k + \omega_k^2 Q_k = 0 \tag{7.28}$$

Each of these equations can be obtained using the Lagrangian

$$L = \frac{1}{2} \left( \dot Q_k^2 - \omega_k^2 Q_k^2 \right) \tag{7.29}$$

which describes oscillations with one degree of freedom.

Thus, in terms of the variables $Q_i$, the problem of coupled
oscillations with many degrees of freedom is reduced to the problem of
independent oscillations of linear harmonic oscillators whose number
equals the number of degrees of freedom of the initial oscillating
system. Each harmonic oscillator is described in terms of the
corresponding coordinate $Q_i$.

If we substitute the expressions (7.27) into (7.26), we obtain
a relationship between the initial generalized coordinates and the
quantities $Q_i$, which are known as the normal coordinates. Thus each
generalized coordinate appears as the sum of mutually independent
normal coordinates varying according to a harmonic law. Usually
the oscillation frequencies $\omega_i$ are incommensurable. But then the
sum of expressions involving incommensurable frequencies is a
nonperiodic function of time.

Whatever the initial conditions, in Eqs. (7.20) it can never be
assumed that one of the generalized coordinates remains unaffected
by the oscillations. It is sufficient to have a system start oscillating
at some instant with respect to even one degree of freedom
corresponding to some generalized coordinate, for oscillations to start over all
other degrees of freedom. This is due to the mixed components of the
Lagrangian involving the products $\dot q_1 \dot q_2$ or $q_1 q_2$ connecting the
degrees of freedom. Conversely, $Q_i$ and $Q_{k \neq i}$ are quite unconnected
as long as the solution of the problem takes into account only the
quadratic terms in the potential and kinetic energy.
From Eqs. (7.26), $Q_1$ and $Q_2$ can be expressed in terms of $q_1$ and $q_2$
thus:

$$Q_1 = \frac{\xi_2 q_1 - q_2}{\xi_2 - \xi_1}, \qquad Q_2 = \frac{\xi_1 q_1 - q_2}{\xi_1 - \xi_2} \tag{7.30}$$

If, for example, the initial values of $q_1$ and $q_2$ are so chosen that
at that instant of time $Q_1 = 0$ and $\dot Q_1 = 0$, the oscillation with
frequency $\omega_1$ will not occur at all. For that it is sufficient to take
at $t = 0$ the coordinates and velocities in such a proportion that
$\xi_2 q_1 - q_2 = 0$ and $\xi_2 \dot q_1 - \dot q_2 = 0$. In other words, only the
frequency $\omega_2$ is present, and the oscillations are strictly periodic,
which does not occur in the case of arbitrary initial conditions.

It is immediately apparent from the Lagrangian (7.29) that the
total energy expression in normal coordinates reduces to the form

$$E = \frac{1}{2} \sum_i \left( \dot Q_i^2 + \omega_i^2 Q_i^2 \right) \tag{7.31}$$

since $L = T - U$, and $E = T + U$. Thus, the energy of linear
harmonic oscillators in terms of the coordinates $Q_i$ replaces the
energy of coupled oscillations in terms of the coordinates $q_\mu$.

It should be noted that if the normal coordinates are expressed
directly from Eqs. (7.27) or (7.30), the individual energy components
are additionally multiplied by certain constant numbers $a_i$. But if
$Q_i$ is replaced by $Q_i (a_i)^{1/2}$, these numbers are eliminated from the energy
expression, and it reduces to the form (7.31). An example of such
substitution is offered in the exercise at the end of this section.

Thanks to normal coordinates the consideration of oscillation 
problems is greatly simplified, because the linear harmonic oscillator 
is in many respects one of the most simple mechanical systems. 

The reduction to normal coordinates is essential in studies of the
oscillations of polyatomic molecules, in the theory of crystals, and
in field theory. In addition, normal coordinates are useful in
technical applications of oscillation theory.

The Case of Equal Frequencies. If the roots of Eq. (7.24) coincide,
the general solution must not be written in the form (7.26), but
somewhat differently, namely,

$$\begin{aligned} x &= A \cos\omega t + B \sin\omega t \\ y &= A' \cos\omega t + B' \sin\omega t \end{aligned} \tag{7.32}$$

Four arbitrary constants appear in this solution, as it should be in
a system with two degrees of freedom.

An example of such a system is a pendulum suspended by a string
instead of a hinge. In the approximation (7.32), it turns out that the
pendulum describes an ellipse whose axes are tilted with respect to
the x and y axes of the coordinate system and whose centre coincides
with the origin. If in (7.5) account is taken of the subsequent terms
in the potential energy expansion in powers of the deflection angle
of the pendulum, we find that the oscillation is accompanied by a
rotation of the axes of the ellipse.


EXERCISE 


Find the frequencies and normal oscillations of a double pendulum,
taking the ratios of load masses $\mu = 3/4$ and rod lengths $\lambda = 5/7$.

Solution. From the equation for the oscillation frequencies of a double
pendulum,

$$\omega_1^2 = \frac{7}{2} \frac{g}{l}, \qquad \omega_2^2 = \frac{7}{10} \frac{g}{l}$$

Further, $\xi_1 = -7/3$ and $\xi_2 = 7/5$.
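The numbers quoted in the solution can be checked with a few lines of exact arithmetic. The sketch below assumes units $l = g = m = 1$, kinetic-energy coefficients $1 + \mu$, $\mu\lambda$, $\mu\lambda^2$ and potential-energy coefficients $1 + \mu$, $0$, $\mu\lambda$ (as listed further on in the solution), and reads $\xi$ as the amplitude ratio $\psi/\varphi$; the variable names are illustrative only.

```python
from fractions import Fraction as F
from math import isqrt

# Units l = g = m = 1; mu and lam are the given mass and length ratios.
mu, lam = F(3, 4), F(5, 7)

# Kinetic-energy coefficients (a..) and potential-energy coefficients (b..).
a11, a12, a22 = 1 + mu, mu * lam, mu * lam**2
b11, b22 = 1 + mu, mu * lam          # the mixed potential coefficient is zero

# det(b - w*a) = 0 with w = omega^2 is the quadratic A*w^2 + B*w + C = 0.
A = a11 * a22 - a12**2
B = -(a11 * b22 + a22 * b11)
C = b11 * b22

disc = B * B - 4 * A * C
root = F(isqrt(disc.numerator), isqrt(disc.denominator))
assert root * root == disc           # exact square root for these ratios

w1 = (-B + root) / (2 * A)           # omega_1^2 in units of g/l
w2 = (-B - root) / (2 * A)           # omega_2^2 in units of g/l

def amplitude_ratio(w):
    """psi/phi from (b11 - w*a11)*phi - w*a12*psi = 0."""
    return (b11 - w * a11) / (w * a12)

xi1, xi2 = amplitude_ratio(w1), amplitude_ratio(w2)
print(w1, w2, xi1, xi2)              # 7/2 7/10 -7/3 7/5
```

Working with `Fraction` rather than floats keeps the roots exact, so they can be compared directly with the fractions in the text.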

Let us now write the expression for kinetic energy. For simplicity, we
write $l = g = m = 1$ so that only the ratios $\lambda$ and $\mu$ will appear in the
equations. This gives $\alpha_{11} = 1 + \mu = 7/4$, $\alpha_{12} = \mu\lambda = 15/28$,
$\alpha_{22} = \mu\lambda^2 = 75/196$, $\beta_{11} = 1 + \mu = 7/4$, $\beta_{12} = 0$, and
$\beta_{22} = \mu\lambda = 15/28$.

Let us determine the coefficients $a_i$. To do this we must calculate the
kinetic energy, putting $\varphi = Q_1 + Q_2$ and $\psi = \xi_1 Q_1 + \xi_2 Q_2$:

$$2T = \frac{7}{4} (\dot{Q}_1 + \dot{Q}_2)^2 + \frac{15}{14} (\dot{Q}_1 + \dot{Q}_2) \left( -\frac{7}{3} \dot{Q}_1 + \frac{7}{5} \dot{Q}_2 \right) + \frac{75}{196} \left( -\frac{7}{3} \dot{Q}_1 + \frac{7}{5} \dot{Q}_2 \right)^2 = \frac{4}{3} \dot{Q}_1^2 + 4 \dot{Q}_2^2$$

Consequently, we must put $a_1 = 4/3$ and $a_2 = 4$.

Denoting $2Q_1/\sqrt{3}$ and $2Q_2$ again by the letters $Q_1$ and $Q_2$, we have
the expression for potential energy

$$2U = \frac{7}{4} \left( \frac{\sqrt{3}}{2} Q_1 + \frac{1}{2} Q_2 \right)^2 + \frac{15}{28} \left( -\frac{7}{2\sqrt{3}} Q_1 + \frac{7}{10} Q_2 \right)^2 = \frac{7}{2} Q_1^2 + \frac{7}{10} Q_2^2$$

as it should be according to (7.10). The generalized coordinates are related
to the normal coordinates as follows:

$$Q_1 = \frac{\sqrt{3}}{28} (7\varphi - 5\psi), \qquad Q_2 = \frac{5}{28} (7\varphi + 3\psi)$$

Thus, if $7\varphi = -3\psi$ and $7\dot{\varphi} = -3\dot{\psi}$ initially, then we have $Q_2 = 0$
for all moments of time, so that both pendulums oscillate with one
frequency $\omega_1$, with the constant relationship between the deflection angles
$7\varphi = -3\psi$ holding all the time. Both pendulums are deflected to opposite
sides of the vertical. The other normal oscillation, with frequency $\omega_2$, occurs
for a constant angular relationship $7\varphi = 5\psi$. In this case the pendulums
are deflected to the same side.


8


NONINERTIAL FRAMES OF REFERENCE

In Section 3 we stated the principle of equivalence of all inertial 
frames of reference, or the relativity principle. It is a reflection of the 
special significance of inertial frames in mechanics. 

However, the frames of reference we conventionally treat as
inertial are actually only approximations. A frame fixed on the
surface of the earth, for example, cannot be considered strictly
inertial because of the earth’s diurnal rotation. That is the reason why
the oscillation plane of a Foucault pendulum rotates with a speed
depending on the geographical latitude (see Exercise 1).

The rotation of the oscillation plane of a Foucault pendulum cannot
be explained in terms of any interaction with the earth, because
gravity cannot make the pendulum rotate precisely from east to
west rather than from west to east.² However, when only a few
oscillations are considered the rotation of the plane is still small
and can be neglected. A reference frame fixed on the earth does not
have time to display its noninertial qualities. There is always a
certain measure of error with which a given real reference frame
approximates an inertial system, and it depends on the duration of the
motion process being investigated.

Thus, the concept of an inertial reference frame is meaningful as
an approximation and is an extremely convenient idealization for
mechanics. In such a reference frame the interaction forces are
measured in terms of the acceleration of bodies. In the absence of
inertial frames we would have to treat the fundamental equation of
mechanics (1.1) as merely a definition of force in general and could
write it as an identity. Yet it is significant that, thanks to the
possibility of observing mechanical systems in inertial frames of
reference, Eq. (1.1) makes it possible to determine the measure of
physical interactions between bodies. This measure does not depend
on the choice of an inertial reference frame, which is precisely what
the relativity principle is all about.

The Galilean Transformations. The mathematical expression of 
the relativity principle consists in that the equations of motion 
written for one inertial frame retain their form in a transformation 
to another inertial frame. 

² It is important that the oscillation plane of a Foucault pendulum is
strictly vertical, since otherwise the pendulum would have an initial angular
momentum around the vertical and describe an ellipse whose axes would
necessarily rotate (see end of Section 7).

The equations for transforming from one inertial frame to another
can be obtained only on the basis of certain physical assumptions.
In Newtonian mechanics it is always assumed that the forces of
interaction between bodies, notably gravity, are transmitted
instantaneously over any distance. Therefore the displacement of a body
immediately imparts a certain momentum to another body, wherever
it is located. Thanks to this a clock located in an inertial frame can be
immediately synchronized with a clock moving together with another
inertial frame. Thus, time in mechanics is accepted as universal;
this is not a supplementary hypothesis but a corollary of the
assumption of instantaneous action at a distance. In electrodynamics, where
the speed of propagation of interactions is finite, time is not universal.

Figure 9

In Newtonian mechanics, however, it is assumed that in going 
over from one inertial frame of reference to another, moving with 
a speed V relative to the first, time in both systems is the same, that 
is, that the readings of initially synchronized clocks always coincide. 
Later we shall see that the assumption is an approximation valid only 
when the relative velocities are substantially less than the speed of 
light. 

We shall now obtain the equations for transforming from one 
inertial frame of reference to another. Let the coordinate axes in 
both frames be drawn so that the abscissas are directed along the 
relative velocity V and the ordinates are mutually parallel. We can 
observe directly from Figure 9 that the abscissa $x$ of a certain point
in the frame we agree to call fixed is connected with the abscissa $x'$
in the moving frame by the simple relationship

$$x = x' + Vt \tag{8.1}$$

provided the origins coincided at time $t = 0$.

The other transformation formulas are even simpler:

$$y = y', \qquad z = z' \tag{8.2}$$

The relationship $t = t'$ is an expression of the hypothesis of the
universality of time; the limits of its application are established by
Einstein’s relativity principle.

Condition (8.1) is absolutely symmetrical with respect to both
inertial reference frames: if the one with the primed quantities is
assumed fixed and the other moving, the form of (8.1) does not
change, though, of course, $V$ must be replaced by $-V$.

In the present case symmetry is assured by the fact that $t = t'$.
If $t \neq t'$, the transformation equations $x = x' + Vt$ and
$x' = x - Vt'$ would contradict each other. That is, if time is not
assumed the same in all inertial frames, the mathematical
expression of the relativity principle should be more complex than Eq. (8.1).
But it would appear that (8.1) follows quite obviously from Figure 9. 
Here we must in large measure forgo the “self-evidence” which in 
fact stems from our everyday experience with velocities that are 
small in comparison with the speed of light. 

It can easily be shown on the basis of Eq. (8.1) why the Newtonian
equations have the same form in all inertial reference frames. The
forces of interaction depend on the relative coordinates of the
particles; they therefore do not change in the transformation (8.1), since
the common term $Vt$ cancels out in the argument of any function
involving the differences between the coordinates. The left-hand
sides of the Newtonian equations contain accelerations, that is, the
second derivatives of the coordinates with respect to time. But since
time is involved linearly in Eq. (8.1) and, according to the basic
assumption, is the same in both frames of reference, $\ddot{x} = \ddot{x}'$. Hence
the equations of mechanics are of identical form in any inertial
reference frame. In other words, it is conventionally said that the
equations of mechanics are invariant with respect to these
transformations, usually known as the Galilean transformations.
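The two facts used in this argument, that coordinate differences and accelerations are unchanged by (8.1), can be illustrated numerically. The following is only a sketch: the trajectories, the value of $V$ and the helper names are hypothetical, and the acceleration is estimated by a central finite difference.

```python
def to_fixed(x_moving, t, V):
    """Galilean transformation (8.1): x = x' + V*t, with t = t'."""
    return x_moving + V * t

def second_derivative(f, t, h=1e-3):
    """Central finite-difference estimate of the second time derivative."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2

V = 3.0  # relative velocity of the two frames (illustrative value)

# Hypothetical trajectories of two particles, given in the moving frame.
def x1_moving(t):
    return 0.5 * t**2

def x2_moving(t):
    return 2.0 + t - 0.25 * t**2

def x1_fixed(t):
    return to_fixed(x1_moving(t), t, V)

def x2_fixed(t):
    return to_fixed(x2_moving(t), t, V)

t = 1.7
separation_moving = x2_moving(t) - x1_moving(t)
separation_fixed = x2_fixed(t) - x1_fixed(t)      # the term V*t cancels
accel_moving = second_derivative(x1_moving, t)
accel_fixed = second_derivative(x1_fixed, t)      # V*t is linear in t

print(separation_fixed - separation_moving, accel_fixed - accel_moving)
```

Both printed differences vanish to within rounding, which is all that the invariance of the Newtonian equations requires.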

The invariance of the laws of mechanics in the Galilean
transformations is the essence of the relativity principle of Newtonian
mechanics. It should be borne in mind here that the relativity principle,
which is an expression of the equivalence of all inertial frames of
reference, is a reflection of a much more general law of nature than
the approximate formula (8.1). In application to electromagnetic
phenomena this formula and the equality $t = t'$ are replaced by
much more general relationships, from which the Galilean
transformations emerge as a limiting case when the velocities of the particles
and relative velocities of the reference frames are small in comparison
with the speed of light.

Rotating Frames of Reference. The relativity principle does not,
of course, imply the equivalence of inertial and noninertial frames 
of reference. 

Notably, in transformations to rotating frames several new
characteristic terms appear in the equations of mechanics, which we
shall derive later.

Drawing the $z$ axis along the rotation axis, we denote the
components of the radius vector $\mathbf{r}'$ in the fixed frame $x'$, $y'$, $z'$, and in the
rotating frame, $x$, $y$, $z$. As is known from analytic geometry the
components are connected by the equations

$$\begin{aligned}
x &= x' \cos\alpha + y' \sin\alpha \\
y &= -x' \sin\alpha + y' \cos\alpha \\
z &= z'
\end{aligned}$$

where $\alpha$ is the angle of rotation.

Differentiate these equations with respect to time and denote the
derivative $\dot{\alpha}$ by the letter $\omega$. This is the angular velocity of rotation,
or the number of radians per second. We have

$$\begin{aligned}
\dot{x} &= \dot{x}' \cos\alpha + \dot{y}' \sin\alpha - \omega x' \sin\alpha + \omega y' \cos\alpha = \dot{x}' \cos\alpha + \dot{y}' \sin\alpha + \omega y \\
\dot{y} &= -\dot{x}' \sin\alpha + \dot{y}' \cos\alpha - \omega x' \cos\alpha - \omega y' \sin\alpha = -\dot{x}' \sin\alpha + \dot{y}' \cos\alpha - \omega x \\
\dot{z} &= \dot{z}'
\end{aligned}$$

Suppose a line segment of length $\omega$ is laid off along the $z$ axis.
Considering it as a vector, we have $\omega_z = |\boldsymbol{\omega}|$, $\omega_x = 0$, $\omega_y = 0$.
Having the vector $\boldsymbol{\omega}$, we can replace the product $\omega y$ by $-(\boldsymbol{\omega} \times \mathbf{r})_x$,
and $-\omega x$ by $-(\boldsymbol{\omega} \times \mathbf{r})_y$. The terms involving $\dot{x}'$, $\dot{y}'$, and $\dot{z}'$ are
essentially the projections of the particle’s velocity in the inertial
frame transformed to the rotating reference frame. Calling them for
the time being $\dot{x}'_{\mathrm{rot}}$, $\dot{y}'_{\mathrm{rot}}$, $\dot{z}'_{\mathrm{rot}}$, and expressing them with the help
of the obtained equations, we have

$$\begin{aligned}
\dot{x}'_{\mathrm{rot}} &= \dot{x} + (\boldsymbol{\omega} \times \mathbf{r})_x \\
\dot{y}'_{\mathrm{rot}} &= \dot{y} + (\boldsymbol{\omega} \times \mathbf{r})_y \\
\dot{z}'_{\mathrm{rot}} &= \dot{z}
\end{aligned} \tag{8.3}$$

These equations can be combined with the help of one vector equation

$$\dot{\mathbf{r}}' = \dot{\mathbf{r}} + \boldsymbol{\omega} \times \mathbf{r} \tag{8.4}$$

Now it is not hard to find the Lagrangian for the variables
referring to the rotating reference frame. First, it is obvious that


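$$\dot{\mathbf{r}}'^2_{\mathrm{rot}} = \dot{x}'^2_{\mathrm{rot}} + \dot{y}'^2_{\mathrm{rot}} + \dot{z}'^2_{\mathrm{rot}} = \dot{x}'^2 + \dot{y}'^2 + \dot{z}'^2 = \dot{\mathbf{r}}'^2$$

since the absolute value of any vector is the same in any coordinate
system. Consequently, in transforming from the variables in the
inertial frame to the variables in the rotating frame we obtain the
required expression for $L$:

$$L = \frac{m}{2} \dot{\mathbf{r}}'^2 - U = \frac{m}{2} \dot{\mathbf{r}}'^2_{\mathrm{rot}} - U = \frac{m}{2} (\dot{\mathbf{r}} + \boldsymbol{\omega} \times \mathbf{r})^2 - U(\mathbf{r}) \tag{8.5}$$

For the sake of simplicity $L$ is written for one material point.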

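Now let us write the Lagrange equation for motion relative to the
rotating reference frame, that is, taking $x$, $y$, and $z$ as the generalized
coordinates. Returning for the time being to the components chosen
in developing Eq. (8.3), the Lagrangian in terms of the components
of the vector takes the form

$$L = \frac{m}{2} \left[ (\dot{x} - \omega y)^2 + (\dot{y} + \omega x)^2 + \dot{z}^2 \right] - U(x, y, z) \tag{8.6}$$

Whence we obtain

$$\frac{\partial L}{\partial \dot{x}} = m(\dot{x} - \omega y), \qquad \frac{\partial L}{\partial \dot{y}} = m(\dot{y} + \omega x), \qquad \frac{\partial L}{\partial \dot{z}} = m\dot{z}$$

$$\frac{\partial L}{\partial x} = m\omega(\dot{y} + \omega x) - \frac{\partial U}{\partial x}, \qquad \frac{\partial L}{\partial y} = -m\omega(\dot{x} - \omega y) - \frac{\partial U}{\partial y}, \qquad \frac{\partial L}{\partial z} = -\frac{\partial U}{\partial z}$$

The Lagrange equations in terms of the components are

$$\begin{aligned}
m(\ddot{x} - \omega\dot{y}) - m\omega(\dot{y} + \omega x) - m\dot{\omega} y + \frac{\partial U}{\partial x} &= 0 \\
m(\ddot{y} + \omega\dot{x}) + m\omega(\dot{x} - \omega y) + m\dot{\omega} x + \frac{\partial U}{\partial y} &= 0 \\
m\ddot{z} + \frac{\partial U}{\partial z} &= 0
\end{aligned}$$

Let us leave only the second derivatives on the left and rewrite the
last three equations as a single vector equation:

$$m\ddot{\mathbf{r}} = m\,\mathbf{r} \times \dot{\boldsymbol{\omega}} + 2m\,\dot{\mathbf{r}} \times \boldsymbol{\omega} + m\,\boldsymbol{\omega} \times (\mathbf{r} \times \boldsymbol{\omega}) - \frac{\partial U}{\partial \mathbf{r}} \tag{8.7}$$

Expanding the double vector product by means of the equation
$\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \mathbf{B} (\mathbf{A} \cdot \mathbf{C}) - \mathbf{C} (\mathbf{A} \cdot \mathbf{B})$, and transforming to
components, we see that (8.7) is equivalent to the preceding system of
three equations. As always, the advantage of vector notation is that
it does not refer the equation to a specific coordinate system and is
much more graphic.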
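Equation (8.7) can be tested on the simplest case. A free particle ($U = 0$) moves uniformly along a straight line in the fixed frame, so in a frame rotating with constant $\omega$ its acceleration should equal $2\dot{\mathbf{r}} \times \boldsymbol{\omega} + \boldsymbol{\omega} \times (\mathbf{r} \times \boldsymbol{\omega})$. The sketch below checks this with finite differences; the initial data, the value of $\omega$ and the function names are illustrative assumptions.

```python
import math

OMEGA = 0.7                  # constant angular velocity about z (illustrative)
R0 = (1.0, -2.0, 0.5)        # initial position in the fixed frame (illustrative)
V0 = (0.3, 0.4, -0.2)        # constant velocity in the fixed frame (illustrative)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def r_rotating(t):
    """Position of the free particle expressed in the rotating frame."""
    xp, yp, zp = (R0[i] + V0[i] * t for i in range(3))   # fixed-frame coordinates
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return (xp * c + yp * s, -xp * s + yp * c, zp)

def velocity(t, h=1e-3):
    a, b = r_rotating(t + h), r_rotating(t - h)
    return tuple((a[i] - b[i]) / (2.0 * h) for i in range(3))

def acceleration(t, h=1e-3):
    a, b, c = r_rotating(t + h), r_rotating(t), r_rotating(t - h)
    return tuple((a[i] - 2.0 * b[i] + c[i]) / h**2 for i in range(3))

def inertial_terms(t):
    """Right-hand side of (8.7) per unit mass for constant omega and U = 0."""
    w = (0.0, 0.0, OMEGA)
    r, v = r_rotating(t), velocity(t)
    coriolis = tuple(2.0 * x for x in cross(v, w))    # 2 * (r_dot x omega)
    centrifugal = cross(w, cross(r, w))               # omega x (r x omega)
    return tuple(coriolis[i] + centrifugal[i] for i in range(3))

t = 2.3
lhs = acceleration(t)
rhs = inertial_terms(t)
max_error = max(abs(lhs[i] - rhs[i]) for i in range(3))
print(max_error)
```

The residual is limited only by the finite-difference step, so the check is approximate by construction.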

Inertial Forces. The first three terms on the right in (8.7)
essentially distinguish the equations of motion, written relative to a
rotating frame of reference, from the equations written relative to
an inertial frame.

The use of a noninertial frame is determined by the nature of the 
problem. For example, if the motion of terrestrial bodies is being 
studied, it is natural to choose the earth as the frame of reference, 
and not the sun. If we consider the reaction of a passenger to a train 
that suddenly stops, we must take the train as the frame of reference 
and not the station platform. When the train is braked sharply, the 
passenger continues to move forwards “inertially” or, more precisely, 
at the initial time of braking still has the velocity of the train’s 
uniform motion. Thus, relative to the carriage, there is the familiar 
jerk forward. Obviously, the noninertial frame is the train and not 
the earth, since no one experiences any jerk on the platform. 

The additional terms on the right in Eq. (8.7) have the same
origin as the jerk when the train stops: they are produced by the
noninertial character (in the given case, rotation) of the frame of
reference. Naturally, the acceleration of a point caused by the
noninertial character of the reference frame is absolutely real, relative
to that frame, in spite of the fact that there are other, inertial, frames
relative to which this acceleration does not exist. In equation (8.7)
this acceleration is written as if it were due to some additional forces.
These forces are usually called inertial forces. Unlike interaction
forces, inertial forces are proportional only to the mass of the body
to which they are applied. This is natural, since the accelerations
caused by the noninertial character of a reference frame are, by
definition, the same for all bodies placed at the same point of the
reference frame and moving in the same way within it. The term
“force” is applied to them because the respective expressions are
proportional to the product of the mass times the acceleration.

There is only one natural force possessing this property,
Newtonian gravitation. The accelerations of all free falling bodies are,
as is known, the same. If in Eq. (8.7) we put $U = mgz$, the mass
of the moving body cancels out, yielding a universal law of motion
that does not depend on the body’s mass. The same would refer to
the case of $U = 0$. In a noninertial frame, bodies accelerated by
inertial forces move neither rectilinearly nor uniformly. Thus there
exists a remarkable likeness between a body freely moving in a
noninertial frame and a body subject, in addition to the usual forces,
to the action of gravity. Einstein’s theory of gravitation is based
on this experimental fact.

Let us now consider in more detail the inertial forces appearing in
(8.7), which are due to the rotation of the reference frame. The first 
term on the right in (8.7) occurs as a result of the angular velocity 
being variable. It will not interest us. The second term is called the 
Coriolis force. For a Coriolis force to appear, the velocity of a point 
relative to a rotating reference frame must have a nonzero projection 
on a plane perpendicular to the axis of rotation. This velocity
projection can, in turn, be resolved into two components: one
perpendicular to the radius drawn from the axis of rotation to the
moving point, and the other directed along the radius. The most
interesting, as to its action, is the component of the Coriolis force
due to the radial component of velocity. It is perpendicular both to
the radius and to the axis of rotation. If a body moves perpendicular
to the radius and to the axis of rotation, then the Coriolis force is
radial and is compounded with the centrifugal force, which will be
considered further on.

We note that the Coriolis force cannot be related, even formally, 
to the gradient of a potential function. 

There are many examples of the action of the Coriolis force in
nature. The water of rivers in the Northern Hemisphere that flow
in the direction of the meridian, that is, from north to south or from
south to north, experiences a deflection towards the right bank
(looking in the direction of flow). This is why the right bank of such rivers
is steeper than the left. It is easy to find the corresponding
component of the Coriolis force. The angular velocity vector of the earth’s
rotation is directed “upwards” from the north pole. The waters of
a river flowing southwards at the middle latitudes of the Northern
Hemisphere have a velocity component perpendicular to the earth’s
axis and directed away from the axis. This means that the Coriolis
acceleration of the water, relative to the earth, is in a westerly
direction or, relative to a river flowing southwards, to the right. If
the river flows in a northerly direction, the deflection will be towards
the east, that is, again to the right. In the Southern Hemisphere the
deflection is to the left bank.

The Coriolis force substantially affects the motion of air and water
masses of the earth, even though it is very small in comparison with
the force of gravity. Indeed, the angular velocity of the earth as it
makes one rotation in 24 hours is slightly less than $10^{-4}$ radians per
second, while the velocities of particles of water or air may range
from 10 to $10^4$ cm·s$^{-1}$ (the latter in winds of hurricane force). Hence
the Coriolis acceleration may range from $10^{-3}$ to 1 cm·s$^{-2}$, or from
one-millionth to one-thousandth of the acceleration of gravity.
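These orders of magnitude are easy to reproduce. The sketch below takes the 24-hour rotation period used in the text, assumes the rounded value $g = 981$ cm·s$^{-2}$, and uses the upper bound $2\omega v$ for the Coriolis acceleration (reached when the velocity is perpendicular to the earth's axis); the names are illustrative.

```python
import math

DAY_S = 24 * 3600.0              # rotation period used in the text, seconds
G_CGS = 981.0                    # acceleration of gravity, cm/s^2 (rounded value)

omega = 2.0 * math.pi / DAY_S    # angular velocity of the earth, rad/s

def coriolis_acceleration(speed_cm_s):
    """Upper bound 2*omega*v, in cm/s^2, for a speed given in cm/s."""
    return 2.0 * omega * speed_cm_s

print(f"omega = {omega:.3g} rad/s")
for v in (10.0, 1.0e4):
    a = coriolis_acceleration(v)
    print(f"v = {v:g} cm/s: a = {a:.3g} cm/s^2, a/g = {a / G_CGS:.2g}")
```

The printed ratios come out close to one-millionth and one-thousandth, in agreement with the estimate above.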

The Coriolis force also causes the rotation of the plane of
oscillation of the Foucault pendulum, which is used to prove the earth’s
rotation about its axis without resorting to astronomical
observations. From the dynamic point of view the choice of the reference
frames to be taken as inertial and rotating is highly relevant.

The third vector term in Eq. (8.7) is the usual centrifugal force. Indeed,
it is perpendicular to the axis of rotation and in absolute value is
equal to

$$|m\,\boldsymbol{\omega} \times (\mathbf{r} \times \boldsymbol{\omega})| = m\,|\boldsymbol{\omega}|\,|\mathbf{r} \times \boldsymbol{\omega}| = m\omega (\omega r \sin\beta) = m\omega^2 r \sin\beta \tag{8.8}$$

Here, angle $\beta$ is the angle between the radius vector $\mathbf{r}$ of the given
point and the rotation axis; the origin of the coordinate system
lies on this axis. The first equality takes account of the fact that the
vectors $\boldsymbol{\omega}$ and $\boldsymbol{\omega} \times \mathbf{r}$ are perpendicular to each other, so that the
absolute value of the vector product is equal to the product of their
absolute values.

But $r \sin\beta$ is equal to the distance from the axis of rotation, so
that this force satisfies the conventional definition of a centrifugal
force.


EXERCISES 

1. Consider the rotation of the oscillation plane of a Foucault pendulum
under the action of the earth’s rotation about its axis.

Solution. At a given point of the earth we direct the $x$ axis to the north
and the $y$ axis to the east. Then, if the vertical angular velocity component
is $\omega_v = \omega \sin\theta$, where $\theta$ is the latitude of the location, and $\omega$ is the angular
velocity of the earth’s rotation, we have the equations of motion

$$\ddot{x} = -\omega_0^2 x - 2\dot{y}\omega_v, \qquad \ddot{y} = -\omega_0^2 y + 2\dot{x}\omega_v, \qquad \omega_0^2 = \frac{g}{l}$$

Multiplying the first equation by $y$ and the second by $x$ and then
subtracting, we get

$$\frac{d}{dt}(y\dot{x} - x\dot{y}) = -\frac{d}{dt}(y^2 + x^2)\,\omega_v$$

Integrating and transforming to polar coordinates ($x = r \cos\varphi$, $y = r \sin\varphi$),
we have

$$r^2 \dot{\varphi} = r^2 \omega_v$$

Whence, after cancelling out $r^2$, we have

$$\dot{\varphi} = \omega_v$$

which gives the angular velocity of rotation of the oscillation plane.

2. Determine the deflection of a falling body to the east.

Solution. Here the horizontal component $\omega_h = \omega \cos\theta$ of the angular
velocity rather than the vertical is important. For the deflection $x$ we obtain

$$\ddot{x} = 2\dot{z}\omega_h = 2gt\omega_h$$

whence

$$x = \frac{1}{3} g t^3 \omega_h$$

Expressed in terms of the height of fall $z$ the equation is as follows:

$$x = \frac{1}{3} \frac{(2z)^{3/2}}{g^{1/2}} \omega_h$$

Substitution of real figures reveals that observation of such deflection is
difficult.
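To see why the observation is difficult, one can substitute figures into the result of Exercise 2. The sketch below assumes a 24-hour rotation period, $g = 9.81$ m·s$^{-2}$, and an illustrative fall of 100 m at latitude 45°; these numbers and the function names are not taken from the text.

```python
import math

OMEGA_EARTH = 2.0 * math.pi / (24 * 3600.0)   # rad/s, one rotation in 24 hours
G = 9.81                                       # m/s^2 (assumed value)

def eastward_deflection(height_m, latitude_deg):
    """x = (1/3) * (2z)^(3/2) * g^(-1/2) * omega * cos(latitude), in metres."""
    omega_h = OMEGA_EARTH * math.cos(math.radians(latitude_deg))
    return (2.0 * height_m) ** 1.5 / (3.0 * math.sqrt(G)) * omega_h

def eastward_deflection_from_time(fall_time_s, latitude_deg):
    """The same result written as x = (1/3) * g * t^3 * omega_h."""
    omega_h = OMEGA_EARTH * math.cos(math.radians(latitude_deg))
    return G * fall_time_s ** 3 * omega_h / 3.0

height = 100.0        # metres (illustrative)
latitude = 45.0       # degrees (illustrative)
x_height = eastward_deflection(height, latitude)
x_time = eastward_deflection_from_time(math.sqrt(2.0 * height / G), latitude)
print(f"{x_height * 100:.2f} cm")
```

For these figures the deflection is roughly one and a half centimetres over a fall of 100 m, small enough to be masked by air currents.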


9 


DYNAMICS OF RIGID BODIES 

The dynamics of a rigid body is a large independent chapter of
mechanics and is very rich in technical applications. Our aim is to
give only a brief account of the basic concepts of this branch of
mechanics, inasmuch as it contains instructive examples of general
laws. In addition, certain mechanical quantities that characterize
a rigid body are necessary for an understanding of molecular spectra.

The Kinetic Energy of a Rigid Body. As was shown in Section 2, 
a rigid body has six degrees of freedom. Three of them relate to the 
translational motion of the centre of mass in space. The remaining 
three correspond to rotation of the body relative to its centre of mass. 

In Section 4, it was shown that the kinetic energy of a system
consists of the kinetic energy of the motion of the whole mass of the
body concentrated at the centre of mass and the kinetic energy of the
motion of the separate particles relative to the centre of mass. In the
case of a rigid body, relative motion reduces to rotation with the
angular velocity $\boldsymbol{\omega}$, the same for all the particles. Naturally, both
the magnitude and the direction of $\boldsymbol{\omega}$ may vary with time.

Let us calculate the kinetic energy of rotation of a rigid body.
In the general case, the density $\rho$ of the body may not be uniform
over the whole volume of the body, and may depend on the
coordinates: $\rho = \rho(x, y, z) = \rho(\mathbf{r})$. The mass of an element of volume $dV$
is equal to $dm = \rho(\mathbf{r})\,dV$. The velocity of rotation $\mathbf{v}$ is, from (8.4),
$\boldsymbol{\omega} \times \mathbf{r}$ (we neglect translational motion). Hence, the kinetic energy
of unit mass of the body is $\frac{1}{2} (\boldsymbol{\omega} \times \mathbf{r})^2$, where

$$(\boldsymbol{\omega} \times \mathbf{r})^2 = \omega^2 r^2 \sin^2\beta = \omega^2 r^2 - \omega^2 r^2 \cos^2\beta = \omega^2 r^2 - (\boldsymbol{\omega} \cdot \mathbf{r})^2$$

The kinetic energy of the whole body is represented by the integral
of this quantity, weighted by the density, over the volume:

$$T = \frac{1}{2} \int \rho\, (\boldsymbol{\omega} \times \mathbf{r})^2 \, dV \tag{9.1}$$

Expressing the square of the vector product in terms of the
components $\omega_\mu$ and $x_\nu$, we have

$$\omega^2 = \omega_x^2 + \omega_y^2 + \omega_z^2, \qquad \boldsymbol{\omega} \cdot \mathbf{r} = \omega_x x + \omega_y y + \omega_z z, \qquad r^2 = x^2 + y^2 + z^2$$

We now make use of the summation rule:

$$\omega^2 = \omega_\mu \omega_\mu, \qquad \boldsymbol{\omega} \cdot \mathbf{r} = \omega_\mu x_\mu, \qquad r^2 = x_\nu x_\nu$$

In forming the products $r^2 \omega^2$ and $(\boldsymbol{\omega} \cdot \mathbf{r})^2$ the summation indices should
be denoted by different letters, so that no index occurs more
than twice. Hence

$$(\boldsymbol{\omega} \times \mathbf{r})^2 = \omega_\mu \omega_\mu x_\nu x_\nu - \omega_\mu x_\mu \omega_\nu x_\nu$$

Let us introduce a symbol that will prove highly useful for the
subsequent discourse:

$$\delta_{\mu\nu} = \begin{cases} 1 & \text{at } \mu = \nu \\ 0 & \text{at } \mu \neq \nu \end{cases} \tag{9.2}$$

With the help of this symbol $\omega_\mu \omega_\mu$ can be identically rewritten as

$$\omega_\mu \omega_\mu = \omega_\mu \omega_\nu \delta_{\mu\nu}$$

and the square of the vector product as

$$(\boldsymbol{\omega} \times \mathbf{r})^2 = \omega_\mu \omega_\nu (r^2 \delta_{\mu\nu} - x_\mu x_\nu)$$

The components of the angular velocity of a rigid body are
constant over its volume. Consequently they can be taken out from under
the integration sign in (9.1), and the kinetic energy is reduced to
the form

$$T = \frac{1}{2} \omega_\mu \omega_\nu \int \rho\, (\delta_{\mu\nu} r^2 - x_\mu x_\nu) \, dV \tag{9.3}$$

All the integrals involved in (9.3) depend only on the shape of the
body and the density distribution in it; in a reference frame moving
together with the body they do not depend on its motion. Let us now
write all six factors of $\omega_\mu \omega_\nu$ in explicit form, remembering that
$x_1 = x$, $x_2 = y$, $x_3 = z$, $\delta_{11} = \delta_{22} = \delta_{33} = 1$, $\delta_{12} = \delta_{13} = \delta_{23} = 0$:

$$\begin{aligned}
I_{11} = I_{xx} &= \int \rho\, (r^2 - x^2) \, dV = \int \rho\, (y^2 + z^2) \, dV \\
I_{22} = I_{yy} &= \int \rho\, (r^2 - y^2) \, dV = \int \rho\, (x^2 + z^2) \, dV \\
I_{33} = I_{zz} &= \int \rho\, (r^2 - z^2) \, dV = \int \rho\, (y^2 + x^2) \, dV
\end{aligned}$$

and

$$\begin{aligned}
I_{12} = I_{xy} &= -\int \rho\, xy \, dV \\
I_{13} = I_{xz} &= -\int \rho\, xz \, dV \\
I_{23} = I_{yz} &= -\int \rho\, yz \, dV
\end{aligned}$$

The quantities with the same subscripts are called the moments of
inertia, those with different subscripts are the products of inertia.
Note that $I_{\mu\nu} = I_{\nu\mu}$. Using the notation $I_{\mu\nu}$ we find that the
kinetic energy of a rotating solid body can be rewritten in compact
form as

$$T = \frac{1}{2} I_{\mu\nu} \omega_\mu \omega_\nu \tag{9.4}$$
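The equivalence of (9.1) and (9.4) can be illustrated by replacing the integral over $\rho\,dV$ with a sum over a few point masses. The sketch below is an assumption-laden illustration: the masses, positions, angular velocity and function names are hypothetical, and the rotation is taken about the origin of the coordinate system.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))

# Hypothetical rigid body: point masses (m, (x, y, z)) standing in for rho*dV.
PARTICLES = [(1.0, (1.0, 0.0, 0.0)), (2.0, (0.0, 1.0, 1.0)),
             (3.0, (1.0, 1.0, 0.0)), (1.5, (-1.0, 0.5, 2.0))]

def inertia_tensor(particles):
    """I[mu][nu] = sum of m * (r^2 * delta(mu, nu) - x_mu * x_nu), cf. (9.3)."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, r in particles:
        r2 = dot(r, r)
        for mu in range(3):
            for nu in range(3):
                delta = 1.0 if mu == nu else 0.0
                I[mu][nu] += m * (r2 * delta - r[mu] * r[nu])
    return I

def kinetic_energy_direct(particles, w):
    """T = (1/2) * sum of m * (omega x r)^2, the discrete form of (9.1)."""
    return 0.5 * sum(m * dot(cross(w, r), cross(w, r)) for m, r in particles)

def kinetic_energy_tensor(I, w):
    """T = (1/2) * I[mu][nu] * omega_mu * omega_nu, Eq. (9.4)."""
    return 0.5 * sum(I[mu][nu] * w[mu] * w[nu]
                     for mu in range(3) for nu in range(3))

omega = (0.2, -0.5, 1.0)          # illustrative angular velocity
I = inertia_tensor(PARTICLES)
T_direct = kinetic_energy_direct(PARTICLES, omega)
T_tensor = kinetic_energy_tensor(I, omega)
print(T_direct, T_tensor)
```

The two values agree to rounding error, and the computed array is symmetric, as noted for $I_{\mu\nu}$ above.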

Tensors. Before continuing with rigid-body dynamics, let us
examine in greater detail the geometric nature of the quantities $I_{\mu\nu}$.
Unlike a vector, whose components have single subscripts, $I_{\mu\nu}$ has
two subscripts.

It was pointed out before that a vector quantity is characterized
by its transformation law in the rotation of a coordinate system.
Denoting the cosines of the angles between the old and new axes by
the symbols $(\mu, \nu)$ (see Exercise 3, Section 4), we write the
transformation law applied to the vector components:

$$x'_\mu = (\mu, \nu)\, x_\nu$$

Let us now show that the double-subscript quantity $I_{\mu\nu}$ transforms
according to the law

$$I'_{\mu\nu} = (\mu, \kappa) (\nu, \lambda)\, I_{\kappa\lambda}$$

We start with the symbol $\delta_{\mu\nu}$. Consider three unit vectors directed
along the axes of the turned coordinate system. For them we have

$$\begin{aligned}
n'^{(1)}_1 &= 1, & n'^{(1)}_2 &= 0, & n'^{(1)}_3 &= 0 \\
n'^{(2)}_1 &= 0, & n'^{(2)}_2 &= 1, & n'^{(2)}_3 &= 0 \\
n'^{(3)}_1 &= 0, & n'^{(3)}_2 &= 0, & n'^{(3)}_3 &= 1
\end{aligned}$$

Form their scalar products

$$\begin{aligned}
\mathbf{n}'^{(1)} \cdot \mathbf{n}'^{(1)} &= \mathbf{n}'^{(2)} \cdot \mathbf{n}'^{(2)} = \mathbf{n}'^{(3)} \cdot \mathbf{n}'^{(3)} = 1 \\
\mathbf{n}'^{(1)} \cdot \mathbf{n}'^{(2)} &= \mathbf{n}'^{(1)} \cdot \mathbf{n}'^{(3)} = \mathbf{n}'^{(2)} \cdot \mathbf{n}'^{(3)} = 0
\end{aligned}$$

For these products we can introduce the concise notation

$$\mathbf{n}'^{(\mu)} \cdot \mathbf{n}'^{(\nu)} = \delta_{\mu\nu}$$

A similar equality must hold in any system, notably the
unprimed one. Now transform the unit-vector projections from the
primed to the unprimed system. From the general formulas of
transformations of vectors we obtain

$$n^{(\mu)}_\kappa = n'^{(\mu)}_\alpha (\alpha, \kappa) = (\mu, \kappa), \qquad n^{(\nu)}_\kappa = n'^{(\nu)}_\alpha (\alpha, \kappa) = (\nu, \kappa)$$

From these projections we again form scalar products, and we take
advantage of the fact that they are the same in any coordinate
system. Now we find that the symbol $\delta_{\mu\nu}$ satisfies the equation we had
to prove for $I_{\mu\nu}$:

$$\delta_{\mu\nu} = \mathbf{n}'^{(\mu)} \cdot \mathbf{n}'^{(\nu)} = n^{(\mu)}_\kappa n^{(\nu)}_\kappa = (\mu, \kappa) (\nu, \kappa) = (\mu, \kappa) (\nu, \lambda)\, \delta_{\kappa\lambda}$$

In other words, this means that $\delta_{\mu\nu}$ retains its fundamental property
in a transformation to another coordinate system.

The second term under the integral in (9.3) is the product of
components, each of which is transformed according to a vector
law. Therefore their product transforms in the following manner:

$$x'_\mu x'_\nu = (\mu, \kappa) (\nu, \lambda)\, x_\kappa x_\lambda$$

Consequently, the integrand in (9.3) transforms with the help
of coefficients $(\mu, \kappa)(\nu, \lambda)$. Since they are constant for each given
rotation of the coordinate system, they can be taken outside the
integral sign, thus yielding the required transformation law for $I_{\mu\nu}$.
All quantities with such a transformation law are called tensors of
rank 2. According to this terminology a vector should be called a
tensor of rank 1, and a scalar has rank zero. Tensors may be of higher
rank than 2. The rank is defined by the number of symbols $(\mu, \nu)$, ...,
involved in the transformation law or, what is the same, by the
number of indices occurring once in the tensor expression.

If an index is repeated in a tensor expression, it means that
summation with respect to it has occurred and the rank of the initial tensor
has been reduced by two. Take, for example, a tensor of rank 3,
say, $A_{\mu\nu\lambda}$. It transforms according to the law

$$A'_{\mu\nu\lambda} = (\mu, \alpha) (\nu, \beta) (\lambda, \gamma)\, A_{\alpha\beta\gamma}$$

Let us sum over the indices $\mu$ and $\nu$, that is, put $\nu = \mu$ so that the
index is written twice on both sides of the equation. Making use of the
property of $(\mu, \alpha)(\mu, \beta)$ just proved, we reduce the transformation
law written above to the form

$$A'_{\mu\mu\lambda} = (\mu, \alpha) (\mu, \beta) (\lambda, \gamma)\, A_{\alpha\beta\gamma} = \delta_{\alpha\beta} (\lambda, \gamma)\, A_{\alpha\beta\gamma} = (\lambda, \gamma)\, A_{\alpha\alpha\gamma}$$

But this is the transformation law of a vector, that is, a tensor of
rank 1, so that the rank of the initial tensor has been reduced by two
units, as was asserted.

The kinetic energy expression (9.3)-(9.4) contains all the indices
in pairs, that is, it is a tensor of rank zero: it is a scalar, which, of
course, is just as it should be.
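The rank-2 transformation law and the scalar character of the kinetic energy can both be checked on a small numerical example. The sketch below assumes a rotation of the axes about $z$, with the direction cosines $(\mu, \nu)$ stored as a nested list, and a hypothetical set of point masses in place of the integral; all names and values are illustrative.

```python
import math

def direction_cosines_z(angle):
    """Direction cosines (mu, nu) for axes turned by `angle` about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def transform_vector(C, x):
    """Rank 1: x'_mu = (mu, nu) x_nu."""
    return tuple(sum(C[mu][nu] * x[nu] for nu in range(3)) for mu in range(3))

def transform_rank2(C, I):
    """Rank 2: I'_{mu nu} = (mu, kappa)(nu, lambda) I_{kappa lambda}."""
    return [[sum(C[mu][k] * C[nu][l] * I[k][l]
                 for k in range(3) for l in range(3))
             for nu in range(3)] for mu in range(3)]

def inertia_tensor(particles):
    """Discrete analogue of the integrals in (9.3) for point masses (m, r)."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, r in particles:
        r2 = sum(c * c for c in r)
        for mu in range(3):
            for nu in range(3):
                I[mu][nu] += m * ((r2 if mu == nu else 0.0) - r[mu] * r[nu])
    return I

def twice_kinetic_energy(I, w):
    """I_{mu nu} omega_mu omega_nu: every index is paired, so this is a scalar."""
    return sum(I[mu][nu] * w[mu] * w[nu] for mu in range(3) for nu in range(3))

# Hypothetical point masses and angular velocity (illustrative values only).
particles = [(1.0, (1.0, 0.0, 0.5)), (2.0, (0.0, 1.0, 1.0)), (3.0, (1.0, 1.0, 0.0))]
omega = (0.2, -0.5, 1.0)
C = direction_cosines_z(0.6)

I_old = inertia_tensor(particles)
I_by_law = transform_rank2(C, I_old)
I_direct = inertia_tensor([(m, transform_vector(C, r)) for m, r in particles])

max_diff = max(abs(I_by_law[i][j] - I_direct[i][j])
               for i in range(3) for j in range(3))
scalar_old = twice_kinetic_energy(I_old, omega)
scalar_new = twice_kinetic_energy(I_by_law, transform_vector(C, omega))
print(max_diff, abs(scalar_old - scalar_new))
```

Recomputing the tensor from the transformed coordinates and applying the transformation law to the old tensor give the same components, while the fully contracted expression does not change at all.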


Angular Momentum of a Rigid Body. Let us now calculate a
projection of the angular momentum of a rigid body. From the
definition of angular momentum we have

$$M_x = \int \rho\, (\mathbf{r} \times \mathbf{v})_x \, dV = \int \rho\, [\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r})]_x \, dV \tag{9.5}$$

Expanding the double vector product, we reduce $M_x$ to the form

$$M_x = \int \rho\, (\omega_x r^2 - x\, \boldsymbol{\omega} \cdot \mathbf{r}) \, dV = \omega_x \int \rho\, (y^2 + z^2) \, dV - \omega_y \int \rho\, xy \, dV - \omega_z \int \rho\, xz \, dV \tag{9.6a}$$

or

$$M_\alpha = I_{\alpha\beta}\, \omega_\beta \tag{9.6b}$$

The expression of angular momentum again involves the
components of the quantity $I_{\alpha\beta}$, called the inertia tensor.

Comparing (9.6a) and (9.4), we see that

$$M_x = \frac{\partial T}{\partial \omega_x} \tag{9.7}$$

and $M_y$ and $M_z$ appear analogous. In vector form all three equations
can be written as

$$\mathbf{M} = \frac{\partial T}{\partial \boldsymbol{\omega}} \tag{9.8}$$

Equations (9.7) and (9.8) again express the fact that angular
momentum is a generalized momentum related to the rotation of a body.
In this sense (9.7) is analogous to (5.4). There is, however, a
significant difference, for the components $\omega_\alpha$ are not the total time
derivatives of any quantities. (This will be demonstrated further on in this
section.) Therefore $\omega_x$ in (9.7) is not altogether similar to $\dot{\varphi}$ in (5.4).

It is apparent from Eq. (9.6a) that in the most general case the
angular momentum vector is not parallel to the angular velocity
vector. Parallel vectors, as is known, must be linked by a
relationship of the form $M_\alpha = I \omega_\alpha$, where $I$ is a scalar, not a tensor.


Reduction of the Inertia Tensor to the Principal Axes. Nevertheless,
to each specific inertia tensor there correspond certain directions in
space, such that if the angular velocity vector is directed along them,
the angular momentum is in the same direction. As we have just
seen, the condition that the angular momentum and angular velocity
vectors are parallel is expressed thus: $M_\alpha = I \omega_\alpha$. Substituting the
angular momentum according to the general formula (9.6b), we
obtain a set of three equations of the form

$$I_{\alpha\beta}\, \omega_\beta = I \omega_\alpha \tag{9.9a}$$

or (in terms of the components):

$$\begin{aligned}
I_{11}\omega_1 + I_{12}\omega_2 + I_{13}\omega_3 &= I\omega_1 \\
I_{21}\omega_1 + I_{22}\omega_2 + I_{23}\omega_3 &= I\omega_2 \\
I_{31}\omega_1 + I_{32}\omega_2 + I_{33}\omega_3 &= I\omega_3
\end{aligned} \tag{9.9b}$$

For these linear homogeneous equations to have nonzero solutions their
determinant must be zero:

$$\begin{vmatrix}
I_{11} - I & I_{12} & I_{13} \\
I_{21} & I_{22} - I & I_{23} \\
I_{31} & I_{32} & I_{33} - I
\end{vmatrix} = 0 \tag{9.10}$$

All three roots of this equation are real and positive numbers. This
can be shown by reasoning in the same way as in Section 7, when
we proved that $\omega^2$ is positive in a system performing small
oscillations (for the case of two roots). We shall not give the proof here.

Suppose that in the most general case all three roots of Eq. (9.10)
are different. In substituting them into Eq. (9.9b) the components $\omega_1$,
$\omega_2$, and $\omega_3$ must be taken proportional to the minors of one of the
rows, for example the first. Let $\omega^{(i)}_\alpha$ and $\omega^{(k)}_\alpha$ be solutions
corresponding to two different values $I_i$, $I_k$. (The Latin index numbers the
solutions of Eq. (9.10), the Greek index denotes the vector
component; here the summation rule refers only to the indices numbering
the components.)

Now write Eqs. (9.9a) for two different i , k: 

= I i<Oa\ /aP<4f° = Ih^a 

Multiply the left equation by co^ and the right one by co£\ and 
subtract one from the other to get 


7aptt>a W P 7aptt>a W P — (7 i fe) 

For the left-hand side of this equation we make use of the basic 
property of indices over which the summation is performed: any pair 
of identical indices in any term of the equation can be denoted by 
another letter without changing the indices in any other terms of the 
equation. This is due to the fact that the denomination of an index 
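passing from 1 through 3 in summation is quite immaterial. Accordingly,
in the second term on the left we redesignate $\alpha$ as $\beta$, and
$\beta$ as $\alpha$. We also take advantage of the fact that $I_{\beta\alpha} = I_{\alpha\beta}$, that is,
the inertia tensor is symmetric with respect to its indices. Hence

$$I_{\alpha\beta}\,\omega_\beta^{(k)}\omega_\alpha^{(i)} = I_{\beta\alpha}\,\omega_\alpha^{(k)}\omega_\beta^{(i)} = I_{\alpha\beta}\,\omega_\beta^{(i)}\omega_\alpha^{(k)}$$

Then the two terms on the left cancel out, and we find that

$$(I_i - I_k)\,\omega_\alpha^{(i)}\omega_\alpha^{(k)} = 0 \qquad (9.11)$$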

But we agreed in advance that the roots of Eq. (9.10) are different,
hence the first factor in (9.11) is nonzero. Consequently the scalar
product vanishes, and vectors $\boldsymbol\omega^{(i)}$ and $\boldsymbol\omega^{(k)}$ are mutually
perpendicular. Thus, for every point of a rigid body there are three
mutually perpendicular lines, such that in a rotation about them the
directions of the angular momentum and angular velocity coincide.
These three lines are called the principal axes of inertia at the given
point, and $I_1$, $I_2$, $I_3$ are the principal moments of inertia.
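The reduction to principal axes can be checked numerically. The sketch below is only an illustration: it builds the inertia tensor of a few point masses and diagonalizes it by Jacobi rotations, a standard technique used here for convenience instead of the procedure with minors described above. The function names and the sample masses are assumptions, not taken from the text.

```python
import math

def inertia_tensor(masses, points):
    """I_ab = sum over masses of m * (r^2 * delta_ab - x_a * x_b)."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, points):
        r2 = sum(x * x for x in r)
        for a in range(3):
            for b in range(3):
                I[a][b] += m * ((r2 if a == b else 0.0) - r[a] * r[b])
    return I

def principal_axes(I, sweeps=50, tol=1e-24):
    """Cyclic Jacobi rotations for a symmetric 3x3 matrix.

    Returns (moments, axes), where axes[k] is the unit vector
    belonging to the principal moment moments[k].
    """
    A = [row[:] for row in I]
    V = [[1.0 if a == b else 0.0 for b in range(3)] for a in range(3)]
    for _ in range(sweeps):
        off = sum(A[p][q] ** 2 for p in range(3) for q in range(3) if p != q)
        if off < tol:
            break
        for p in range(2):
            for q in range(p + 1, 3):
                if A[p][q] == 0.0:
                    continue
                # Angle that annihilates the element A[p][q].
                t = 0.5 * math.atan2(2.0 * A[p][q], A[q][q] - A[p][p])
                c, s = math.cos(t), math.sin(t)
                for k in range(3):  # rotate columns p, q of A and V
                    akp, akq = A[k][p], A[k][q]
                    A[k][p], A[k][q] = c * akp - s * akq, s * akp + c * akq
                    vkp, vkq = V[k][p], V[k][q]
                    V[k][p], V[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
                for k in range(3):  # rotate rows p, q of A
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k], A[q][k] = c * apk - s * aqk, s * apk + c * aqk
    moments = [A[k][k] for k in range(3)]
    axes = [[V[a][k] for a in range(3)] for k in range(3)]
    return moments, axes

# Sample (hypothetical) body made of four point masses.
sample_masses = [1.0, 2.0, 3.0, 4.0]
sample_points = [(1, 0, 0), (0, 1, 1), (1, 1, 0), (-1, 2, 1)]
sample_I = inertia_tensor(sample_masses, sample_points)
sample_moments, sample_axes = principal_axes(sample_I)
```

For each returned axis the product of the tensor with the axis vector should reproduce the same vector multiplied by the corresponding principal moment, and the three axes should come out mutually perpendicular, in agreement with (9.11).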

Suppose that two of the three principal moments of inertia are
equal. According to (9.11) the directions of the corresponding principal
axes are perpendicular to the third axis, for which the moment
of inertia is not equal to the other two. We draw a plane perpendicular
to the third axis through the origin of the coordinate system.
Then in a rotation of the body about any line in this plane the angular
momentum $\mathbf{M}$ is directed along the rotation axis. Thus, if $I_1 = I_2$,
any two mutually perpendicular lines lying in the plane passing
through the two corresponding axes of inertia can be taken as principal axes. In other
words, if the axis of rotation lies in the plane of two equal moments
of inertia, then the vectors $\mathbf{M}$ and $\boldsymbol\omega$ are parallel.

If all three principal moments of inertia coincide, $I_1 = I_2 = I_3$,
then in a rotation about any axis the angular momentum $\mathbf{M}$ is directed
along that axis.

Now write the expression for the kinetic energy of rotation, assuming
that the coordinate axes coincide with the principal axes
of inertia. From the definition of principal axes of inertia we have

$$M_1 = I_1\omega_1, \qquad M_2 = I_2\omega_2, \qquad M_3 = I_3\omega_3 \qquad (9.6c)$$

These equations are obtained if an arbitrary vector of angular
momentum $\mathbf{M}$ is resolved along the principal axes. The fact that in
the most general case each component is multiplied by its own number $I_\alpha$
is an additional indication that vectors $\mathbf{M}$ and $\boldsymbol\omega$ are, in general, not parallel.
We make use of Eqs. (9.7), which yield

$$\frac{\partial T}{\partial\omega_1} = I_1\omega_1, \qquad \frac{\partial T}{\partial\omega_2} = I_2\omega_2, \qquad \frac{\partial T}{\partial\omega_3} = I_3\omega_3$$

Integrating, we obtain the required expression for the kinetic energy:

$$T = \frac{1}{2}\left(I_1\omega_1^2 + I_2\omega_2^2 + I_3\omega_3^2\right) \qquad (9.12)$$

This equation can be graphically interpreted in the following way.
Lay off the angular velocity components $\omega_1$, $\omega_2$, and $\omega_3$ along the
axes of an arbitrary coordinate system. Then at constant $T$ Eq. (9.12)
represents a triaxial ellipsoid. Knowing the direction of the rotation 
axis and the kinetic energy of rotation, we can find all three angular 
velocity components according to the point of intersection of the 
axis with the ellipsoid. If two moments of inertia are equal, the 
triaxial ellipsoid becomes an ellipsoid of rotation; if three moments 
of inertia are equal, it becomes a sphere. It is clear from this why 
the choice of principal axes of inertia in a plane of equal moments 
of inertia is arbitrary. 

Euler’s Equations of Motion for a Rotating Rigid Body. We shall 
now develop equations that show how the angular momentum M 
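varies with time. For a separate mass point we have

$$\frac{d\mathbf{M}}{dt} = \frac{d}{dt}(\mathbf{r}\times\mathbf{p}) = \dot{\mathbf{r}}\times\mathbf{p} + \mathbf{r}\times\dot{\mathbf{p}} = \mathbf{r}\times\mathbf{F}$$

where the first term vanishes, since $\dot{\mathbf{r}}$ and $\mathbf{p}$ are parallel. Integrating
this equation over the volume of the rigid body (with $\mathbf{F}$ now understood
as the force per unit volume) and taking advantage
of the additive property of angular momentum, we have

$$\dot{\mathbf{M}} = \int (\mathbf{r}\times\mathbf{F})\,dV = \mathbf{K} \qquad (9.13)$$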

The vector in the right-hand side of (9.13), which we denote by $\mathbf{K}$,
is called the resultant torque on the body. If $\mathbf{F}$ is the gravitational force
(which occurs in the majority of cases), then torque $\mathbf{K}$ can also be
written as

$$\mathbf{K} = -\int \rho g\,(\mathbf{r}\times\mathbf{z}_0)\,dV$$

where $\mathbf{z}_0$ is the unit vector in the vertical direction. But since $\mathbf{z}_0$ is
a constant, it can be taken outside the integration sign, and for
the torque due to gravitational forces we obtain the expression

$$\mathbf{K} = g\,\mathbf{z}_0\times\int \rho\,\mathbf{r}\,dV$$

Suppose a body is supported at its centre of mass or, what is the 
same thing, at its centre of gravity. Then the resultant gravitational 
force is balanced by the reaction of the support, and the resultant 
torque vanishes, since from the definition of the centre of mass the 
integral for all three projections is zero. If K = 0, the rigid body 
rotates as a free body. This case of motion occurs in a demonstration 
gyroscope. 

For the angular momentum of a rigid body to be conserved all
that is required is for the resultant torque to be zero. The angular
momentum vector of an arbitrary mechanical system is conserved if
there are no external forces at all (with the exception of the case of
the whole system moving in the field of a fixed attractive centre).

Equation (9.13) is inconvenient when written in a fixed frame of
reference. In such a frame the inertia tensor linking vectors $\mathbf{M}$
and $\boldsymbol\omega$, according to Eq. (9.6b), itself becomes a variable, because
the integrals involved in Eq. (9.3), taken with respect to a fixed
frame, in the most general case change as the rigid body
rotates. It is preferable to refer the equation to a reference
frame fixed relative to the body, taking into account the
frame's accelerated motion. The change of vector $\mathbf{M}$ relative to the
moving axes consists of two components: one due to the change of
the vector itself, the other to the motion of the axes on which it is
projected. For vector $\mathbf{M}$ the latter change is equal to $\boldsymbol\omega\times\mathbf{M}$, just as
for the radius vector $\mathbf{r}$ in Section 8 it was equal to $\boldsymbol\omega\times\mathbf{r}$. (In a rotation
of a coordinate system any vector varies as a radius vector.)
Then Eq. (9.13) written with respect to moving axes looks like this:

$$\frac{d\mathbf{M}}{dt} + \boldsymbol\omega\times\mathbf{M} = \mathbf{K} \qquad (9.14)$$

where the vector of torque should also be projected on the moving 
axes. 

What, consequently, is required is a set of kinematic equations
defining the direction of the moving axes relative to the fixed ones.
Two systems of axes can be related by means of nine cosines or, in
our notation, symbols $(\mu, \kappa)$. As was shown, this symbol is the
projection of a unit vector $\mathbf{n}^{(\kappa)}$ on the moving axis numbered $\mu$.
The unit vector $\mathbf{n}^{(\kappa)}$ does not change relative to the fixed axes:
$\dot{\mathbf{n}}^{(\kappa)} = 0$. Writing the derivative of $\mathbf{n}^{(\kappa)}$ relative to the moving axes,
we obtain

$$\frac{d\mathbf{n}^{(\kappa)}}{dt} + \boldsymbol\omega\times\mathbf{n}^{(\kappa)} = 0 \qquad (9.15)$$


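Thus, for the nine cosines we have three vector equations, that is,
nine equations. Let us show that these equations do not violate the
basic property of unit vectors: $\mathbf{n}^{(\kappa)}\cdot\mathbf{n}^{(\lambda)} = \delta_{\kappa\lambda}$. For this write
Eqs. (9.15) for two different unit vectors $\mathbf{n}^{(\kappa)}$ and $\mathbf{n}^{(\lambda)}$:

$$\frac{d\mathbf{n}^{(\kappa)}}{dt} + \boldsymbol\omega\times\mathbf{n}^{(\kappa)} = 0, \qquad \frac{d\mathbf{n}^{(\lambda)}}{dt} + \boldsymbol\omega\times\mathbf{n}^{(\lambda)} = 0$$

Multiply the first by $\mathbf{n}^{(\lambda)}$ scalarly, the second by $\mathbf{n}^{(\kappa)}$, and add.
After a cyclic permutation in the mixed products, we obtain

$$\mathbf{n}^{(\lambda)}\cdot\frac{d\mathbf{n}^{(\kappa)}}{dt} + \mathbf{n}^{(\kappa)}\cdot\frac{d\mathbf{n}^{(\lambda)}}{dt} = \frac{d}{dt}\left(\mathbf{n}^{(\kappa)}\cdot\mathbf{n}^{(\lambda)}\right) = -\boldsymbol\omega\cdot\left(\mathbf{n}^{(\kappa)}\times\mathbf{n}^{(\lambda)}\right) - \boldsymbol\omega\cdot\left(\mathbf{n}^{(\lambda)}\times\mathbf{n}^{(\kappa)}\right) = 0$$

since $\mathbf{n}^{(\kappa)}\times\mathbf{n}^{(\lambda)} = -\mathbf{n}^{(\lambda)}\times\mathbf{n}^{(\kappa)}$. Hence, if the condition
$\mathbf{n}^{(\kappa)}\cdot\mathbf{n}^{(\lambda)} = \delta_{\kappa\lambda}$ was satisfied at the initial time, it holds subsequently,
as was asserted.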

Equations (9.6c), (9.14) and (9.15) fully define the position of a 
rigid body in space. 

It is convenient to eliminate the angular momentum components 
at once with the help of Eqs. (9.6c), thereby referring the motion to 
the principal axes of inertia, which are fixed relative to the body. 
Then instead of Eq. (9.14) we obtain the following set: 


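$$\begin{aligned}
I_1\dot\omega_1 + (I_3 - I_2)\,\omega_2\omega_3 &= K_1 \\
I_2\dot\omega_2 + (I_1 - I_3)\,\omega_3\omega_1 &= K_2 \\
I_3\dot\omega_3 + (I_2 - I_1)\,\omega_1\omega_2 &= K_3
\end{aligned} \qquad (9.16)$$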


These equations, known as Euler's equations , can be reduced to 
quadratures for arbitrary values of the integrals of the motion in the 
following main cases: 

(i) The point of support of the body lies at the centre of mass, so
that $K_1 = K_2 = K_3 = 0$; the relationship between the principal
moments of inertia is arbitrary.

(ii) $I_1 = I_2 \neq I_3$, and the point of support lies on the symmetry
axis relative to which two moments of inertia are equal; the point
of support does not coincide with the centre of mass. This is known
as a symmetric top.

(iii) $I_1 = I_2 = 2I_3$; the centre of mass lies in a plane through
the point of support perpendicular to the axis of symmetry. This is
Kovalevskaya's top.

It has also been shown that the set of equations (9.16) can be 
integrated in quadratures with arbitrary integrals of the motion only 
if conditions (i), (ii), or (iii) hold, and in a few similar cases. 


The Free Top. Case (i), when K = 0, is integrated analytically 
in very complex form. We shall therefore consider only some general 
properties of the rotation of a free top, which can be obtained by 
investigating the integrals of the motion. 

In a fixed frame of reference the angular momentum vector $\mathbf{M}$ is
conserved. In a noninertial frame fixed with respect to the top,
$\mathbf{M}$ is, of course, not conserved. But since the rotation does not affect
the absolute value of the vector, the square of the angular momentum,
$M^2$, must be conserved in a rotating system. This is easily
proved directly by multiplying the first equation in the set (9.16)
by $I_1\omega_1$, the second by $I_2\omega_2$, and the third by $I_3\omega_3$. Then, adding
all three equations, at $\mathbf{K} = 0$ we obtain

$$\frac{1}{2}\frac{d}{dt}\left(I_1^2\omega_1^2 + I_2^2\omega_2^2 + I_3^2\omega_3^2\right) = \frac{1}{2}\frac{d}{dt}\left(M_1^2 + M_2^2 + M_3^2\right) = \frac{1}{2}\frac{dM^2}{dt} = 0$$

It follows from this set of equations that the kinetic energy $T$
is also conserved. The equations should be multiplied by $\omega_1$, $\omega_2$, $\omega_3$
and added, yielding $T$ = constant.
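The two conservation laws can be verified by integrating Euler's equations (9.16) numerically. The following sketch assumes a classical fourth-order Runge-Kutta step; the function names, the moments of inertia and the initial angular velocity are illustrative choices, not values from the text.

```python
def euler_rhs(w, I, K=(0.0, 0.0, 0.0)):
    """Euler's equations (9.16) solved for the time derivatives of omega."""
    I1, I2, I3 = I
    w1, w2, w3 = w
    return (
        (K[0] - (I3 - I2) * w2 * w3) / I1,
        (K[1] - (I1 - I3) * w3 * w1) / I2,
        (K[2] - (I2 - I1) * w1 * w2) / I3,
    )

def rk4_step(f, y, h):
    """One classical Runge-Kutta step for dy/dt = f(y)."""
    k1 = f(y)
    k2 = f(tuple(a + 0.5 * h * b for a, b in zip(y, k1)))
    k3 = f(tuple(a + 0.5 * h * b for a, b in zip(y, k2)))
    k4 = f(tuple(a + h * b for a, b in zip(y, k3)))
    return tuple(a + h / 6.0 * (b + 2.0 * c + 2.0 * d + e)
                 for a, b, c, d, e in zip(y, k1, k2, k3, k4))

def kinetic_energy(w, I):
    """T from Eq. (9.12)."""
    return 0.5 * sum(Ia * wa * wa for Ia, wa in zip(I, w))

def momentum_squared(w, I):
    """M^2 = (I1 w1)^2 + (I2 w2)^2 + (I3 w3)^2."""
    return sum((Ia * wa) ** 2 for Ia, wa in zip(I, w))

def integrate_free_top(w0, I, t_end, h=1e-3):
    """Free top (K = 0): advance omega from t = 0 to t = t_end."""
    w = tuple(w0)
    for _ in range(int(round(t_end / h))):
        w = rk4_step(lambda y: euler_rhs(y, I), w, h)
    return w
```

With, say, `I = (1.0, 2.0, 3.0)` and `w0 = (0.3, 1.0, -0.5)`, the components of the angular velocity change appreciably over a few time units, while `kinetic_energy` and `momentum_squared` should keep their initial values to within the integration error.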

Let us now express the projections of the angular velocity vectors 
in terms of the angular momentum projections with the aid of (9.6c) 
and substitute them into the expression for kinetic energy. Then 
(together with the condition that the square of the angular momen¬ 
tum is conserved), we obtain two equations: 


$$\frac{M_1^2}{2I_1} + \frac{M_2^2}{2I_2} + \frac{M_3^2}{2I_3} = T \qquad (9.17)$$

$$M_1^2 + M_2^2 + M_3^2 = M^2 \qquad (9.18)$$

If segments $M_1$, $M_2$, and $M_3$ are laid off along the axes of a coordinate
system, then (9.18) is the equation of a sphere and (9.17)
is the equation of a triaxial ellipsoid. Vector $\mathbf{M}$ must satisfy both
equations, that is, it should lie on the intersection of the two surfaces.

The intersection lines are different depending on where they lie. 
Their geometry is shown approximately in Figure 10. Near the 
major axis the ellipsoid intersects with the sphere, and the inter¬ 
section line is consequently a closed curve. Near the minor axis the 
ellipsoid is flattened most, so that the sphere emerges from it, and 
the intersection lines are again closed. Near the median axis there 
is a section of the ellipsoid with greater curvature than the sphere 
and one with smaller curvature. As a result the intersection line 
through the median axis is a cross, while close to it the lines are 
like hyperbolas, with the crossed lines as their asymptotes. 

It follows from this construction that if the angular momentum 
vector lies somewhere near the major inertia axis, it describes a closed 
curve around it. In a fixed frame of reference, where the angular 
momentum vector does not change, the major axis of the ellipsoid 
rotates around the angular momentum vector along a closed curve. 
Hence rotation about the major axis of inertia of a body is stable. 
The same is true of rotation about the minor axis of inertia. As for 
the median axis, the curves are open, so that the angular momentum 
vector does not stay close to the axis, and rotation about it is un¬ 
stable. 
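The stability statement can be illustrated numerically. The sketch below integrates the free Euler equations (9.16 with K = 0) by a Runge-Kutta step, starting with a rotation almost exactly about one principal axis, and records the smallest value reached by the angular velocity component along that axis. The function names, the moments of inertia, the size of the perturbation and the integration time are all assumptions chosen for the illustration.

```python
def free_euler_rhs(w, I):
    """Euler's equations (9.16) with zero torque, solved for d(omega)/dt."""
    I1, I2, I3 = I
    w1, w2, w3 = w
    return (
        (I2 - I3) * w2 * w3 / I1,
        (I3 - I1) * w3 * w1 / I2,
        (I1 - I2) * w1 * w2 / I3,
    )

def rk4_step(f, y, h):
    """One classical Runge-Kutta step for dy/dt = f(y)."""
    k1 = f(y)
    k2 = f(tuple(a + 0.5 * h * b for a, b in zip(y, k1)))
    k3 = f(tuple(a + 0.5 * h * b for a, b in zip(y, k2)))
    k4 = f(tuple(a + h * b for a, b in zip(y, k3)))
    return tuple(a + h / 6.0 * (b + 2.0 * c + 2.0 * d + e)
                 for a, b, c, d, e in zip(y, k1, k2, k3, k4))

def min_spin_component(I, axis, eps=1e-3, t_end=60.0, h=5e-3):
    """Spin about principal axis `axis` (0, 1 or 2) with unit angular velocity
    plus a small perturbation eps on the other two components; return the
    smallest value taken by the component along `axis` during the motion."""
    w = [eps, eps, eps]
    w[axis] = 1.0
    w = tuple(w)
    lowest = w[axis]
    for _ in range(int(round(t_end / h))):
        w = rk4_step(lambda y: free_euler_rhs(y, I), w, h)
        lowest = min(lowest, w[axis])
    return lowest
```

For `I = (1.0, 2.0, 3.0)` the component along the first or the third axis stays close to its initial value, whereas the component along the second (median) axis eventually reverses sign: the angular velocity vector leaves the neighbourhood of that axis, as the construction with the sphere and the ellipsoid predicts.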


For the subsequent investigation it is convenient to lay off the
components of the angular velocity vector $\omega_1$, $\omega_2$, and $\omega_3$ along the
coordinate axes rather than the angular momentum components.



Figure 10 

We write the energy and angular momentum equations as follows: 

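$$\frac{I_1\omega_1^2}{2} + \frac{I_2\omega_2^2}{2} + \frac{I_3\omega_3^2}{2} = T$$

$$I_1^2\omega_1^2 + I_2^2\omega_2^2 + I_3^2\omega_3^2 = M^2$$

Let us find the direction cosines of the normal to the ellipsoid
whose equation expresses the conservation of energy. We obtain

$$\cos\alpha_1 = \frac{I_1\omega_1}{\sqrt{I_1^2\omega_1^2 + I_2^2\omega_2^2 + I_3^2\omega_3^2}} = \frac{I_1\omega_1}{M} = \frac{M_1}{M}$$

and analogously for the other components. But the ratio $M_1/M$ is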
the cosine of the angle between the angular momentum vector in the 
fixed frame of reference, where it is conserved, and the moving axes 
connected with the body. Consequently, an osculating plane to the 
ellipsoid remains constantly oriented in space perpendicular to the 
angular momentum vector. 

Now take vector $\boldsymbol\omega$ drawn towards the point of contact of the plane
and the ellipsoid. Its projection on the direction of the normal to the
ellipsoid is

$$d = \omega_1\cos\alpha_1 + \omega_2\cos\alpha_2 + \omega_3\cos\alpha_3 = \frac{1}{M}\sum_i I_i\omega_i^2 = \frac{2T}{M} = \text{constant}$$

This is the projection of a vector, drawn to the point of contact,
on the direction of the normal to the surface of the ellipsoid at that
point. Hence, it is also the projection of vector $\boldsymbol\omega$ on the direction
of the normal to the osculating plane or, what is the same thing,
the length of a perpendicular from the origin of the coordinate system
to the osculating plane.

Thus, a plane perpendicular to the total angular momentum and
tangent to the ellipsoid whose equation expresses the conservation
of energy, remains at a constant distance $d$ from the origin of the
coordinate system along the axes of which are laid off the segments
$\omega_1$, $\omega_2$, and $\omega_3$, that is, the projections of the angular velocity of the
body on the moving axes. The ellipsoid, so to say, rolls along a spatially
fixed plane, and the radius vector drawn to the point of contact
of the ellipsoid and the plane gives the instantaneous value
of the angular velocity in both magnitude and direction.


Free Symmetric Top. As pointed out before, the problem on the 
rotation of a symmetric top whose point of support lies on the axis 
of symmetry has an exact solution. Mathematically it is rather com¬ 
plex. On the other hand, the solution of the problem of a free sym¬ 
metric top is quite simple. We shall obtain it in the way just elabo¬ 
rated for the case of an arbitrary free top. 

Suppose that $I_2 = I_3$, that is, the ellipsoid whose equation expresses
the conservation of energy is an ellipsoid of rotation. It is
shown in Figure 11, where the osculating plane has, quite arbitrarily,
been chosen horizontal. For the sake of making the drawing clearer,
the angular velocity vector has been drawn away from, rather than
towards, the point of contact of the ellipsoid and the plane. This
simply means that an identical osculating plane is presumed drawn
above the ellipsoid. Furthermore, the vectors have not been drawn
to scale and are somewhat exaggerated, so that vector $\boldsymbol\omega$ should be
pictured as drawn only up to the surface of the ellipsoid, and its
projections reduced accordingly.

It is apparent from Figure 11 that angle $\vartheta$ between the axis of
symmetry and the direction of the normal to the plane, that is
the direction of the total angular momentum, remains unchanged,
because the point of contact describes a circle in a plane perpendicular
to the axis of symmetry. Angle $\vartheta$ is determined from the equation

$$\cos\vartheta = I_1\omega_1/M$$





The angular velocity component along the axis of symmetry 
remains constant. This means that the axis of symmetry revolves 
uniformly about the direction of the total angular momentum, 
which is constant in space. Such motion is known as precession . 

Since precession takes place about the vector of total angular
momentum, its angular velocity $\omega_p$ must be directed along $\mathbf{M}$.
The component along the axis of symmetry has no relationship to
precession. As can be seen from Figure 11, $\omega_2 = \omega_p\sin\vartheta$. But the
projection of the total angular momentum on axis 2, which is assumed
constant in the plane of the drawing (and not fixed to the
body of the top), is equal to $M_2 = I_2\omega_2 = M\sin\vartheta$, whence

$$\omega_p = M/I_2 \qquad (9.19)$$

Figure 11

If the point of support of the top is not at the centre of mass, the 
total angular momentum is not conserved, only its vertical compo¬ 
nent is. At sufficiently high angular velocities the top performs oscil¬ 
lations in the vertical plane, which superimpose on the precession. 

The Euler Angles. We shall now show how to describe the rotation
of a rigid body in space in terms of the parameters defining its position.
For this it is useful to draw two coordinate systems: one fixed,
$Oxyz$, and the other fixed with respect to the body, $Ox'y'z'$, usually
in such a way that the latter's axes coincide with the principal axes
of inertia at the given point. Then the position of the moving
coordinate system relative to the fixed one is uniquely given by
the three Euler angles (Figure 12):

$\vartheta$ is the angle between axes $Oz$ and $Oz'$;

$\varphi$ is the angle between the intersection line $OK$ of planes $xOy$
and $x'Oy'$ and the $x'$ axis;

$\psi$ is the angle between $OK$ and the $x$ axis.

As we know, the angular velocity of rotation is laid off as a segment
directed along the rotation axis, that is, perpendicular to the
plane of rotation. Line $OK$ is perpendicular (by the construction)
to the $z$ and $z'$ axes, hence it is perpendicular to the plane through
them. The angle of rotation $\vartheta$ is laid off in this plane, so that the
angular velocity $\dot\vartheta$ is directed along $OK$. Similarly, we see that the
angular velocity $\dot\varphi$ is laid off along the $z'$ axis, and $\dot\psi$ along the $z$ axis.



Now express the angular velocity projections $\omega_1$, $\omega_2$, and $\omega_3$ on
the principal axes of inertia $Ox'$, $Oy'$ and $Oz'$ in terms of the generalized
velocities $\dot\psi$, $\dot\varphi$, and $\dot\vartheta$. The projection $\omega_3$ is that of the angular
velocity on the $z'$ axis (the third axis). As pointed out, $\dot\varphi$ is projected
fully on this axis, while the projection of $\dot\psi$ is equal to $\dot\psi\cos\vartheta$, since
$\vartheta$ is the angle between the $z$ and $z'$ axes. Hence

$$\omega_3 = \dot\varphi + \dot\psi\cos\vartheta \qquad (9.20)$$

In order to find the projections of the angular velocity on the
other two axes we mentally draw a line $OL$ in the plane $x'Oy'$ perpendicular
to $OK$ ($OL$ is not shown in Figure 12). Then

$$\angle LOx' = \frac{\pi}{2} - \varphi \quad\text{and}\quad \angle zOL = \frac{\pi}{2} + \vartheta$$

since, like all lines perpendicular to $OK$, $OL$ lies in the $z,z'$-plane.

The projection of $\dot\psi$ on $OL$ is equal to $-\dot\psi\sin\vartheta$, and the projection
on $Ox'$ is equal to $-\dot\psi\sin\vartheta\cos(\pi/2 - \varphi)$, or $-\dot\psi\sin\vartheta\sin\varphi$.
The projection of $\dot\psi$ on $Oy'$ is $\dot\psi\sin\vartheta\cos\varphi$. The projections of $\dot\vartheta$ on
$Ox'$ and $Oy'$ are apparent from the diagram: they are $\dot\vartheta\cos\varphi$ and
$\dot\vartheta\sin\varphi$. The result is therefore

$$\omega_1 = \dot\vartheta\cos\varphi - \dot\psi\sin\vartheta\sin\varphi \qquad (9.21)$$

$$\omega_2 = \dot\vartheta\sin\varphi + \dot\psi\sin\vartheta\cos\varphi \qquad (9.22)$$

From Eqs. (9.20)-(9.22) we see that $\omega_1$, $\omega_2$, and $\omega_3$ are not total
time derivatives of any quantities and, in that sense, do not exactly
agree with the usual notion of generalized velocities (as do $\dot\varphi$, $\dot\psi$, $\dot\vartheta$).
This was discussed in connection with Eq. (9.7).

If we substitute into (9.12) the expressions for $\omega_1$, $\omega_2$, $\omega_3$ in terms
of the Euler angles, we obtain the kinetic energy of a rigid body as
a function of the generalized coordinates $\varphi$, $\vartheta$ and velocities $\dot\varphi$,
$\dot\psi$, $\dot\vartheta$.
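As a small illustration, the projections of the angular velocity on the body axes can be written as a function of the Euler angles and their rates: the third projection is the rate of the rotation about the body axis plus the projection of the rate about the fixed vertical, and the first two combine the rate of the tilt angle with the remaining part of the rotation about the vertical. The names `theta`, `phi` and their rates stand for the tilt angle between the two third axes, the angle between the line OK and the first body axis, and the corresponding generalized velocities; the function name and the sample numbers are hypothetical.

```python
import math

def body_angular_velocity(theta, phi, theta_dot, phi_dot, psi_dot):
    """Projections (w1, w2, w3) of the angular velocity on the body axes,
    following Eqs. (9.20)-(9.22)."""
    w1 = theta_dot * math.cos(phi) - psi_dot * math.sin(theta) * math.sin(phi)
    w2 = theta_dot * math.sin(phi) + psi_dot * math.sin(theta) * math.cos(phi)
    w3 = phi_dot + psi_dot * math.cos(theta)
    return w1, w2, w3

# Arbitrary sample state, used only to exercise the function.
sample_w = body_angular_velocity(theta=0.7, phi=1.1,
                                 theta_dot=0.2, phi_dot=3.0, psi_dot=-0.5)
```

A useful check is that the sum of the squares of the first two projections does not depend on `phi`: it equals the squared tilt rate plus the squared rate about the vertical multiplied by the squared sine of the tilt angle, which is why the angle of rotation about the body axis drops out of the kinetic energy of a symmetric top.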

A Symmetric Top in a Gravitational Field. We shall find the
Lagrangian for a symmetric top whose point of support lies on the
axis of symmetry at a distance $l$ below the centre of mass. If the
top is inclined at an angle $\vartheta$ to the vertical, the height $z$ of the centre
of mass above the point of support is $l\cos\vartheta$. Hence, the potential
energy of the top is

$$U = mgz = mgl\cos\vartheta \qquad (9.23)$$

The kinetic energy of the top, expressed in terms of the Euler
angles, is

$$T = \frac{1}{2}I_1\left(\omega_1^2 + \omega_2^2\right) + \frac{1}{2}I_3\omega_3^2 = \frac{1}{2}I_1\left(\dot\vartheta^2 + \dot\psi^2\sin^2\vartheta\right) + \frac{1}{2}I_3\left(\dot\varphi + \dot\psi\cos\vartheta\right)^2 \qquad (9.24)$$

The difference gives the Lagrangian for a symmetric top. But 
since it does not contain time explicitly, the total energy E = T + U 
is an integral of motion: 

E = T + U = constant 

We can find two more integrals of the motion, noting that the
angles $\varphi$ and $\psi$ do not appear explicitly in the Lagrangian, that is,
$\varphi$ and $\psi$ are cyclic coordinates ($\varphi$ is eliminated only in the case of
a symmetric top, and $\psi$ is not involved in $T$ at all). We obtain

$$p_\varphi = \frac{\partial L}{\partial\dot\varphi} = I_3\left(\dot\varphi + \dot\psi\cos\vartheta\right) = \text{constant} \qquad (9.25)$$

$$p_\psi = \frac{\partial L}{\partial\dot\psi} = I_1\sin^2\vartheta\,\dot\psi + I_3\cos\vartheta\left(\dot\varphi + \dot\psi\cos\vartheta\right) = \text{constant} \qquad (9.26)$$

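If we express $\dot\varphi$ and $\dot\psi$ from the last two equations and substitute
them into the energy integral, the latter will contain only the
variable $\vartheta$, which allows us to reduce the problem to quadrature.
Substituting (9.25) into (9.26), we obtain

$$p_\psi = I_1\sin^2\vartheta\,\dot\psi + p_\varphi\cos\vartheta$$

whence

$$\dot\psi = \frac{p_\psi - p_\varphi\cos\vartheta}{I_1\sin^2\vartheta}$$

The energy integral, after substituting $\dot\varphi$ and $\dot\psi$, is

$$E = \frac{1}{2}I_1\dot\vartheta^2 + \frac{\left(p_\psi - p_\varphi\cos\vartheta\right)^2}{2I_1\sin^2\vartheta} + \frac{p_\varphi^2}{2I_3} + mgl\cos\vartheta \qquad (9.27)$$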


Thus, the problem is reduced to motion as it were with one degree
of freedom $\vartheta$. The corresponding "kinetic energy" is $I_1\dot\vartheta^2/2$, and
the "potential energy" is represented by those energy terms which
depend on $\vartheta$. This potential energy becomes infinite for $\vartheta = 0$ and
for $\vartheta = \pi$ (unless $p_\psi = \pm p_\varphi$). Hence, for $0 < \vartheta < \pi$ it has at least one minimum. If
this minimum corresponds to $\vartheta < \pi/2$, the rotation of a top whose
centre of mass is above the point of support is stable. Small oscillations
are possible around the point of stable equilibrium, which
is in this case the dynamic point of equilibrium. These oscillations
are superimposed on the precessional motion of the top, which we
have already noted. They are called nutations.
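The position of the minimum of the "potential energy" is easily found numerically. The sketch below takes the terms of the energy integral that depend on the tilt angle (the centrifugal-type term built from the two conserved momenta, the constant spin term and the gravitational term) and locates the minimum by a golden-section search. The search method, the function names and the sample parameters are assumptions made for the illustration, and the search presumes a single minimum inside the interval.

```python
import math

def u_eff(theta, p_psi, p_phi, I1, I3, mgl):
    """Terms of the energy integral (9.27) that do not contain the tilt rate."""
    c, s = math.cos(theta), math.sin(theta)
    return ((p_psi - p_phi * c) ** 2 / (2.0 * I1 * s * s)
            + p_phi ** 2 / (2.0 * I3) + mgl * c)

def golden_min(f, a, b, tol=1e-9):
    """Golden-section search for a minimum of f on [a, b]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

def equilibrium_tilt(p_psi, p_phi, I1, I3, mgl):
    """Tilt angle at which the effective potential is smallest."""
    f = lambda th: u_eff(th, p_psi, p_phi, I1, I3, mgl)
    return golden_min(f, 1e-3, math.pi - 1e-3)
```

For example, `equilibrium_tilt(p_psi=3.0, p_phi=5.0, I1=1.0, I3=0.5, mgl=1.0)` returns an angle below a right angle, so that for these hypothetical values the top with its centre of mass above the support rotates stably and performs small nutations about this angle.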


EXERCISES 

1. Knowing the integrals of the motion $p_\psi$ and $p_\varphi$, determine the angle
at which the amplitude of nutations may vanish, that is, the top rotates in
a gravitational field like a free top (pseudoregular precession).

Solution. From the energy integral (9.27) we find the essentially positive
quantity $I_1\dot\vartheta^2/2$. In nutations angle $\vartheta$ varies but slightly near the position
of minimum "potential" energy. If the total energy $E$ corresponds to the
minimum "potential" energy position, both points of intersection of the
line $E$ = constant with the "potential" energy curve merge. Differentiating
equation

$$E - \frac{\left(p_\psi - p_\varphi\cos\vartheta\right)^2}{2I_1\sin^2\vartheta} - \frac{p_\varphi^2}{2I_3} - mgl\cos\vartheta = 0$$

with respect to $\vartheta$, equating the derivative to zero, and solving simultaneously
with the energy equation, we obtain the equation for the angle of pseudoregular
precession and $E$:

$$3mgl\cos^2\vartheta - \left(2E - \frac{p_\varphi^2}{I_3} + \frac{p_\varphi^2}{I_1}\right)\cos\vartheta - mgl + \frac{p_\varphi p_\psi}{I_1} = 0$$


2. The axis of a symmetric top (gyroscope) can rotate only in the hori¬ 
zontal plane. Determine its motion, taking into account the effect of the 
earth’s diurnal rotation. 

Solution. Resolve the vector of the angular velocity of the earth's
rotation at the given point of the globe into two components: vertical,
$\omega_v = \omega\sin\vartheta$ (where $\vartheta$ is the latitude of the location), and horizontal, tangent
to the meridian, $\omega_h = \omega\cos\vartheta$; the latter, obviously, is directed to the
north. Let the gyroscope's axis make an angle $\varphi$ to the meridian in the horizontal
plane, and denote the angular velocity of the gyroscope about its
axis as $\omega_0$. Since two moments of inertia of the gyroscope are equal, one
of the principal axes can be assumed constantly directed vertically, and
the other horizontally. Though the gyroscope moves relative to them they
do not lose the property of principal axes, because they remain constantly
in the plane of equal moments of inertia.

The projections of the angular velocities of rotation of the gimbals of
the gyroscope (which keep its axis in the horizontal plane) on the principal
axes of inertia thus chosen are:

$$\omega_1 = \omega_v + \dot\varphi, \qquad \omega_2 = \omega_h\sin\varphi, \qquad \omega_3 = \omega_h\cos\varphi$$

From this we obtain the angular momentum components along the
principal axes of inertia (the angular velocity $\omega_0$ of the gyroscope itself was
not taken into account in $\omega_3$, because the reference frame is fixed relative
to the gimbals):

$$M_1 = I_1(\omega_v + \dot\varphi), \qquad M_2 = I_1\omega_h\sin\varphi, \qquad M_3 = I_3(\omega_0 + \omega_h\cos\varphi)$$

Now we make use of Eqs. (9.14) to find how the components of $\mathbf{M}$ vary with time.
The torque acts only relative to the second axis, due to the reaction of the
support that keeps the gyroscope's axis in the horizontal plane. There is no
need to write this equation. The equation for the first component of angular
momentum yields

$$\frac{dM_1}{dt} + \omega_2 M_3 - \omega_3 M_2 = I_1\frac{d^2\varphi}{dt^2} + I_3\omega_h\sin\varphi\,(\omega_0 + \omega_h\cos\varphi) - I_1\omega_h^2\cos\varphi\sin\varphi = 0$$

Neglecting the small quantity of the order of the square of the earth's angular
velocity, we arrive at an equation that coincides with the equation of pendulum
oscillations, which follows from (3.22):

$$\frac{d^2\varphi}{dt^2} + \frac{I_3}{I_1}\,\omega_0\omega_h\sin\varphi = 0$$

At small angles of deflection from the meridian, $\sin\varphi$ can be replaced
by $\varphi$, and the oscillation period is equal to

$$2\pi\left(\frac{I_1}{I_3\,\omega_0\,\omega\cos\vartheta}\right)^{1/2}$$


This is the principle of action of the gyroscopic compass which, unlike the 
magnetic compass, points straight north. 
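To get a feeling for the magnitudes involved, the period of the small oscillations about the meridian can be evaluated for a sample gyroscope: it is two pi times the square root of the ratio of the transverse moment of inertia to the product of the axial moment, the spin rate, the earth's angular velocity and the cosine of the latitude. The function name and the rotor data in the usage note are hypothetical; the earth's rotation rate of about 7.292e-5 rad/s is the standard sidereal value.

```python
import math

EARTH_RATE = 7.292e-5  # rad/s, angular velocity of the earth's rotation

def gyrocompass_period(I1, I3, spin, latitude_deg, earth_rate=EARTH_RATE):
    """Period (in seconds) of small oscillations of the gyroscope axis
    about the meridian; `spin` is the rotor's angular velocity in rad/s."""
    omega_h = earth_rate * math.cos(math.radians(latitude_deg))
    return 2.0 * math.pi * math.sqrt(I1 / (I3 * spin * omega_h))
```

For instance, `gyrocompass_period(I1=0.6, I3=1.0, spin=2 * math.pi * 300, latitude_deg=45.0)` gives the period for an assumed rotor making 300 revolutions per second. The period grows with latitude and becomes infinite at the poles, where the horizontal component of the earth's angular velocity vanishes and the device can no longer find the meridian.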

3. Using Euler's equations (9.16) and the kinematic equations (9.15),
obtain the first integral of the motion for a Kovalevskaya top.

Solution. Let the point of support of the top lie in the plane of equal
moments of inertia at a distance $l$ from the centre of mass. Denoting the
projections of the vertical direction on the moving axes $n_1$, $n_2$, and $n_3$, we
find that the torque due to the force of gravity relative to the point of support
has components $0$, $-mgl\,n_3$, and $mgl\,n_2$. We write the kinematic
equations for $n_1$, $n_2$, and $n_3$:

$$\begin{aligned}
\frac{dn_1}{dt} + \omega_2 n_3 - \omega_3 n_2 &= 0 \\
\frac{dn_2}{dt} + \omega_3 n_1 - \omega_1 n_3 &= 0 \\
\frac{dn_3}{dt} + \omega_1 n_2 - \omega_2 n_1 &= 0
\end{aligned}$$

Using $I_1 = I_2 = 2I_3$ and denoting $\mu^2 = mgl/I_3$, we write Euler's equations as follows:

$$\begin{aligned}
2\frac{d\omega_1}{dt} - \omega_2\omega_3 &= 0 \\
2\frac{d\omega_2}{dt} + \omega_1\omega_3 &= -\mu^2 n_3 \\
\frac{d\omega_3}{dt} &= \mu^2 n_2
\end{aligned}$$

Multiply the second equation by $i$, add to the first, and multiply the result
by $(\omega_1 + i\omega_2)$ to get

$$\frac{d}{dt}(\omega_1 + i\omega_2)^2 = -i\omega_3(\omega_1 + i\omega_2)^2 - i\mu^2 n_3(\omega_1 + i\omega_2)$$

Multiply the second kinematic equation by $i$ and add to the first to get

$$\frac{d}{dt}(n_1 + in_2) = -i\omega_3(n_1 + in_2) + in_3(\omega_1 + i\omega_2)$$

Finally, multiply both sides of the obtained relationship by $\mu^2$ and add it to
the equation containing the derivative $(d/dt)(\omega_1 + i\omega_2)^2$; the terms with $n_3$
cancel. As a result we obtain

$$\frac{d}{dt}\left[(\omega_1 + i\omega_2)^2 + \mu^2(n_1 + in_2)\right] = -i\omega_3\left[(\omega_1 + i\omega_2)^2 + \mu^2(n_1 + in_2)\right]$$

or, denoting $\Theta = (\omega_1 + i\omega_2)^2 + \mu^2(n_1 + in_2)$,

$$\frac{d\Theta}{dt} = -i\omega_3\Theta$$

The equation for the complex-conjugate quantity $\Theta^*$ is derived in quite
the same way: $d\Theta^*/dt = i\omega_3\Theta^*$. But then, after multiplying the equation for
$\Theta$ by $\Theta^*$, and the one for $\Theta^*$ by $\Theta$, and adding, we obtain the integral of
motion:

$$\frac{d}{dt}\,\Theta\Theta^* = 0, \qquad \left|(\omega_1 + i\omega_2)^2 + \mu^2(n_1 + in_2)\right|^2 = \text{constant}$$
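The integral can be tested numerically. The sketch below writes out explicitly the six equations it integrates (Euler's equations for moments of inertia in the ratio 2 : 2 : 1 with gravitational torque components 0, minus mu2 times n3, and mu2 times n2, together with the kinematic equations for the vertical unit vector), so the sign of the term with n1 + i n2 in the conserved quantity refers to exactly this sign convention. The Runge-Kutta integrator, the function names, the value of `mu2` and the initial state are assumptions made for the illustration.

```python
def kovalevskaya_rhs(y, mu2):
    """Time derivatives of (w1, w2, w3, n1, n2, n3) for the Kovalevskaya top."""
    w1, w2, w3, n1, n2, n3 = y
    return (
        0.5 * w2 * w3,                    # 2 dw1/dt - w2 w3 = 0
        0.5 * (-w1 * w3 - mu2 * n3),      # 2 dw2/dt + w1 w3 = -mu2 n3
        mu2 * n2,                         # dw3/dt = mu2 n2
        w3 * n2 - w2 * n3,                # dn/dt + w x n = 0, component 1
        w1 * n3 - w3 * n1,                # component 2
        w2 * n1 - w1 * n2,                # component 3
    )

def rk4_step(f, y, h):
    """One classical Runge-Kutta step for dy/dt = f(y)."""
    k1 = f(y)
    k2 = f(tuple(a + 0.5 * h * b for a, b in zip(y, k1)))
    k3 = f(tuple(a + 0.5 * h * b for a, b in zip(y, k2)))
    k4 = f(tuple(a + h * b for a, b in zip(y, k3)))
    return tuple(a + h / 6.0 * (b + 2.0 * c + 2.0 * d + e)
                 for a, b, c, d, e in zip(y, k1, k2, k3, k4))

def kovalevskaya_integral(y, mu2):
    """|(w1 + i w2)^2 + mu2 (n1 + i n2)|^2 for the equations above."""
    w1, w2, w3, n1, n2, n3 = y
    theta = complex(w1, w2) ** 2 + mu2 * complex(n1, n2)
    return abs(theta) ** 2

def integrate_top(y0, mu2, t_end, h=1e-3):
    """Advance the state from t = 0 to t = t_end."""
    y = tuple(y0)
    for _ in range(int(round(t_end / h))):
        y = rk4_step(lambda s: kovalevskaya_rhs(s, mu2), y, h)
    return y
```

Starting, for example, from `y0 = (0.4, -0.3, 0.7, 0.6, 0.0, 0.8)` with `mu2 = 1.5`, the state changes considerably over ten time units, while `kovalevskaya_integral` should retain its initial value to within the integration error.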


10 


HAMILTON’S EQUATIONS 

AND THE HAMILTON-JACOBI EQUATION 


Up till now we used the Lagrange set of equations as the equations
of motion. Making use of the concept of generalized momentum $p_\alpha$,
we can write these equations as follows:

$$\frac{dp_\alpha}{dt} = \frac{\partial L}{\partial q_\alpha} \qquad (10.1)$$

$$p_\alpha = \frac{\partial L}{\partial\dot q_\alpha} \qquad (10.2)$$


The generalized coordinates and momenta are involved non- 
symmetrically. In some cases it is convenient to have a symmetric 
set of equations. It is neither better nor worse than the Lagrange 
equations for practical solutions of mechanical problems, but is of 
much greater importance in general research. 


Hamilton's Equations. The Lagrangian depends on the generalized
velocities quadratically, so that Eqs. (10.2) are linear with respect
to all the $\dot q_\alpha$'s. Such equations can always be solved and the $\dot q_\alpha$ expressed
in terms of the generalized momenta.

Let us now form the expression for energy (see (4.1)):

$$E = \dot q_\alpha\frac{\partial L}{\partial\dot q_\alpha} - L(\ldots q_\alpha \ldots,\ \ldots \dot q_\alpha \ldots)$$

and substitute into it all the $\dot q_\alpha$'s expressed in terms of all $p_\beta$'s.

The energy expressed in terms of the generalized coordinates and
corresponding momenta is called the Hamilton function, or simply
the Hamiltonian, of the system:

$$E(\ldots q_\alpha \ldots,\ \ldots \dot q_\alpha(p_\beta) \ldots) \equiv \mathcal{H}(p, q) = \dot q_\alpha p_\alpha - L \qquad (10.3)$$

If, for example, we replace $\dot\vartheta$ in expression (9.27) by $p_\vartheta/I_1$, we
obtain the Hamiltonian of a symmetric top in a gravitational field.

To develop the required set of equations we write the expression for
Hamilton's principle, or the principle of least action, expressing $L$
in the integrand in terms of $\mathcal{H}$:

$$\delta S = \delta\int_{t_0}^{t_1}\left[p_\alpha\dot q_\alpha - \mathcal{H}(p, q)\right]dt = 0 \qquad (10.4)$$

Here all the generalized velocities have been replaced in terms of
the generalized momenta. Obviously, whether an integral is extremal
along an actual path cannot depend on the variables in terms of
which it is expressed.

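Let us calculate the variation $\delta S$:

$$\delta S = \int_{t_0}^{t_1}\left(\delta p_\alpha\,\dot q_\alpha + p_\alpha\,\delta\dot q_\alpha - \frac{\partial\mathcal{H}}{\partial p_\alpha}\delta p_\alpha - \frac{\partial\mathcal{H}}{\partial q_\alpha}\delta q_\alpha\right)dt = 0$$

The second term in the parentheses can be integrated by parts as
was done in (2.17). Then

$$\delta S = \Big[p_\alpha\,\delta q_\alpha\Big]_{t_0}^{t_1} + \int_{t_0}^{t_1}\left[\delta p_\alpha\left(\dot q_\alpha - \frac{\partial\mathcal{H}}{\partial p_\alpha}\right) - \delta q_\alpha\left(\dot p_\alpha + \frac{\partial\mathcal{H}}{\partial q_\alpha}\right)\right]dt = 0 \qquad (10.5)$$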

If we impose the condition that the varied path always passes
through the given endpoints, the integrated part vanishes at the
integration limits. At those points $\delta q_\alpha = 0$. Under the integral
sign $q_\alpha$ and $p_\alpha$ are independent variables, the variations of which
are absolutely arbitrary. Reasoning in exactly the same way as in
deriving the Lagrange equations in Section 2, we conclude that the
variation $\delta S$ can vanish only if the following equations are satisfied:

$$\dot p_\alpha = -\frac{\partial\mathcal{H}}{\partial q_\alpha} \qquad (10.6a)$$

$$\dot q_\alpha = \frac{\partial\mathcal{H}}{\partial p_\alpha} \qquad (10.6b)$$

Instead of n second-order Lagrange equations we obtain 2 n first- 
order equations. They are called Hamilton's equations . 
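A first-order set of this kind is convenient for numerical work. The sketch below integrates the two equations for a system with one degree of freedom: the rate of change of the coordinate is the partial derivative of the Hamiltonian with respect to the momentum, and the rate of change of the momentum is minus its partial derivative with respect to the coordinate. Both derivatives are estimated by central differences, so that only the Hamiltonian itself has to be supplied. The pendulum Hamiltonian (unit mass, length and gravitational acceleration), the Runge-Kutta step and all names are assumptions chosen for the illustration.

```python
import math

def hamilton_rhs(H, q, p, d=1e-6):
    """Return (dq/dt, dp/dt) = (dH/dp, -dH/dq) using central differences."""
    dH_dq = (H(q + d, p) - H(q - d, p)) / (2.0 * d)
    dH_dp = (H(q, p + d) - H(q, p - d)) / (2.0 * d)
    return dH_dp, -dH_dq

def rk4_step(H, q, p, h):
    """One classical Runge-Kutta step of Hamilton's equations."""
    k1q, k1p = hamilton_rhs(H, q, p)
    k2q, k2p = hamilton_rhs(H, q + 0.5 * h * k1q, p + 0.5 * h * k1p)
    k3q, k3p = hamilton_rhs(H, q + 0.5 * h * k2q, p + 0.5 * h * k2p)
    k4q, k4p = hamilton_rhs(H, q + h * k3q, p + h * k3p)
    q_new = q + h / 6.0 * (k1q + 2.0 * k2q + 2.0 * k3q + k4q)
    p_new = p + h / 6.0 * (k1p + 2.0 * k2p + 2.0 * k3p + k4p)
    return q_new, p_new

def pendulum(q, p):
    """Hamiltonian of a pendulum with m = l = g = 1 (sample system)."""
    return 0.5 * p * p - math.cos(q)

def integrate(H, q, p, t_end, h=1e-3):
    """Advance (q, p) from t = 0 to t = t_end."""
    for _ in range(int(round(t_end / h))):
        q, p = rk4_step(H, q, p, h)
    return q, p
```

Since the sample Hamiltonian does not contain time explicitly, its value along the computed trajectory, for example after `integrate(pendulum, 1.0, 0.0, 20.0)`, should stay equal to the initial energy up to the numerical error.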

If $\mathcal{H}$ does not depend explicitly on time, the latter can be eliminated
by dividing all the equations by one of them. For the sake of
simplicity we shall show how this is done for a system with one
degree of freedom:

$$\frac{dp}{dq} = -\frac{\partial\mathcal{H}}{\partial q}\bigg/\frac{\partial\mathcal{H}}{\partial p} \qquad (10.7)$$

Integration yields one arbitrary constant. The second constant is
determined from the quadrature

$$\frac{dt}{dq} = \frac{1}{\partial\mathcal{H}/\partial p}$$

where $\mathcal{H}$ is a function of $q$ and $p$ in which $p$ is expressed in terms
of $q$ with the help of the integrated equation (10.7). The integration
constant of the last equation is simply the initial time $t_0$, which
can be legitimately put equal to zero without loss of generality.
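As a worked illustration of the quadrature, consider a harmonic oscillator with the Hamiltonian p²/2m + kq²/2 (a sample system, not taken from the text). The integrated equation for the momentum as a function of the coordinate is simply the statement that the Hamiltonian equals the constant energy E, and the time needed to travel between two coordinates is the integral of dq divided by the partial derivative of the Hamiltonian with respect to the momentum. The sketch evaluates this integral by the midpoint rule; the function names and the numerical values are assumptions.

```python
import math

def travel_time(dH_dp, p_of_q, q_start, q_end, n=2000):
    """Midpoint-rule estimate of t = integral of dq / (dH/dp) along the path."""
    h = (q_end - q_start) / n
    total = 0.0
    for j in range(n):
        q = q_start + (j + 0.5) * h
        total += h / dH_dp(q, p_of_q(q))
    return total

# Sample oscillator: H = p^2 / (2 m) + k q^2 / 2 with energy E.
m, k, E = 1.0, 4.0, 2.0

def p_of_q(q):
    """Momentum on the energy surface H = E (branch with p > 0)."""
    return math.sqrt(2.0 * m * (E - 0.5 * k * q * q))

def dH_dp(q, p):
    """Partial derivative of the sample Hamiltonian with respect to p."""
    return p / m

amplitude = math.sqrt(2.0 * E / k)
t_half = travel_time(dH_dp, p_of_q, 0.0, 0.5 * amplitude)
```

The result can be compared with the known solution of the oscillator, for which the time to go from the origin to the coordinate q is arcsin(q/A) divided by the angular frequency, A being the amplitude. The upper limit is kept away from the turning point, where the integrand becomes infinite and the simple midpoint rule would converge slowly.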

Note that we cannot substitute the momenta into the Lagrangian 
as we did into the energy to make use of all the known integrals of 
the motion (cf. (9.27)). Substitution of the momenta into the 
Lagrangian yields incorrect equations of motion for other variables. 
However, the following procedure can be adopted. 

Let a certain generalized coordinate $q_\alpha$ be cyclic. We express the
corresponding (constant) momentum and solve the equation for it
with respect to $\dot q_\alpha$. If we form the function

$$R = p_\alpha\dot q_\alpha - L \qquad (10.8)$$

(there is no summation over $\alpha$), then with respect to $q_\alpha$ it will be
equivalent to the Hamiltonian, while remaining a Lagrangian for
all the other variables. In other words, we must write Eqs. (10.6a)
and (10.6b) for $p_\alpha$ and $q_\alpha$, and the first one immediately yields
$p_\alpha$ = constant. The equations of motion for all the other variables
are formed with the help of $R$ according to the rule (2.20). The function
$R$ is known as the Routh function.


Canonical Transformations. On many occasions we had to change
the dynamic variables, in going over, for example, from orthogonal
to spherical coordinates. In such transformations the Lagrange
equations do not, naturally, alter their form. They are written in
the same way on the basis of the variation principle.

Now we must consider transformations of a more general type,
affecting not only the generalized coordinates but the generalized
momenta as well, insofar as both, as was shown, can be considered
to be symmetrical with respect to one another. But there is one
fundamental requirement for this: the transformed coordinates and
momenta must satisfy equations of the same form as the initial
ones, that is (10.6a) and (10.6b), though, perhaps, with a different
Hamiltonian. Hamilton’s equations are sometimes called canonical,
and the required transformations are therefore also known as
canonical.

And so, assume that the old dynamic variables $p_\alpha$ and $q_\alpha$ are
expressed in terms of new ones, $P_\beta$ and $Q_\beta$, for which the Hamiltonian
is not $\mathcal{H}$ but $\mathcal{H}'$.

If the old Lagrangian was equal to $p_\alpha\dot q_\alpha - \mathcal{H}$, then the new one
must have the form $P_\beta\dot Q_\beta - \mathcal{H}'$. But we know from Section 2 that
the Lagrangian is defined only up to the total derivative of some
function of the coordinates and time. Let that function $V$ be dependent
on both the old and the new coordinates. Then the relationship
between the two Lagrangians (the old and the new) is

$$p_\alpha\frac{dq_\alpha}{dt} - \mathcal{H} = P_\beta\frac{dQ_\beta}{dt} - \mathcal{H}' + \frac{dV(q, Q)}{dt} = P_\beta\frac{dQ_\beta}{dt} - \mathcal{H}' + \frac{\partial V}{\partial q_\alpha}\frac{dq_\alpha}{dt} + \frac{\partial V}{\partial Q_\beta}\frac{dQ_\beta}{dt} + \frac{\partial V}{\partial t} \tag{10.9}$$

Since all $dq_\alpha$'s and $dQ_\beta$'s are independent variables, Eq. (10.9)
is valid only when their factors are equal. From this we obtain
equations which the old and new variables must satisfy for the
transformation from the one to the other to be canonical:

$$p_\alpha = \frac{\partial V}{\partial q_\alpha} \tag{10.10}$$

$$P_\beta = -\frac{\partial V}{\partial Q_\beta} \tag{10.11}$$

$$\mathcal{H} = \mathcal{H}' - \frac{\partial V}{\partial t} \tag{10.12}$$


It was pointed out before that the coordinates and momenta are
symmetrically involved in Hamilton’s equations. We can therefore
write transformation formulas which would contain, instead of old
and new coordinates, old coordinates and new momenta, or old
momenta and new coordinates, or old momenta and new momenta.
Let us show, for example, how to pass from transformations (10.10)-
(10.12) to transformations containing $q_\alpha$ and $P_\beta$ instead of $q_\alpha$ and $Q_\beta$.
We put

$$V = V' - P_\beta Q_\beta, \qquad V' = V'(q, P) \tag{10.13}$$

Then the first term in the right-hand side of (10.9), that is
$P_\beta\,(dQ_\beta/dt)$, cancels out with the corresponding term from the
transformation function, and we obtain the equation

$$p_\alpha\frac{dq_\alpha}{dt} - \mathcal{H} = -\mathcal{H}' + \frac{\partial V'}{\partial q_\alpha}\frac{dq_\alpha}{dt} + \frac{\partial V'}{\partial P_\beta}\frac{dP_\beta}{dt} - Q_\beta\frac{dP_\beta}{dt} + \frac{\partial V'}{\partial t} \tag{10.14}$$

From it we find the other transformation formulas:

$$p_\alpha = \frac{\partial V'}{\partial q_\alpha} \tag{10.15}$$

$$Q_\beta = \frac{\partial V'}{\partial P_\beta} \tag{10.16}$$

$$\mathcal{H} = \mathcal{H}' - \frac{\partial V'}{\partial t} \tag{10.17}$$


The method of obtaining the other formulas is obvious. 


The Hamilton-Jacobi Equation. Suppose we have managed to find
a transformation function $V(q, Q)$ such that the new Hamiltonian $\mathcal{H}'$
is identically equal to zero. Then, from (10.6a) and (10.6b), the
new dynamic variables satisfy the following equations:

$$\frac{dP_\beta}{dt} = -\frac{\partial\mathcal{H}'}{\partial Q_\beta} = 0 \tag{10.18}$$

$$\frac{dQ_\beta}{dt} = \frac{\partial\mathcal{H}'}{\partial P_\beta} = 0 \tag{10.19}$$

In other words, both $P_\beta$ and $Q_\beta$ are constant, and since solution
of mechanical problems requires finding $2n$ constants, the function
$V(q, Q)$, which makes the new Hamiltonian vanish, immediately
yields the required solution. This function satisfies the equation

$$\frac{\partial V}{\partial t} + \mathcal{H}\left(q_\alpha, \frac{\partial V}{\partial q_\alpha}, t\right) = 0 \tag{10.20}$$

This can be verified if we put $\mathcal{H}' = 0$ in (10.12) and substitute the
generalized momenta $p$ from (10.15) into it.

The first-order partial differential equation (10.20) in $n$ variables
is called the Hamilton-Jacobi equation.

There is no need to seek a general solution of (10.20) containing $2n$
arbitrary functions. It is sufficient to find the so-called total integral
of the equation, which involves $n$ arbitrary nonadditive constants $C_i$.

If such an integral has been found, the stated $n$ constants can be
taken as the new coordinates, that is, we can put

$$Q_i = C_i$$

Then from (10.11) we obtain the new momenta:

$$P_i = -\frac{\partial V}{\partial Q_i} = -\frac{\partial V}{\partial C_i} \tag{10.21}$$


As can be seen from (10.18), these new momenta are also constant 
quantities. Thus, it is sufficient to find a solution of the Hamilton- 
Jacobi equation containing half of all the integrals of the motion. 
The rest are determined by simple differentiation. 

Of course, the Hamilton-Jacobi equation can usually be solved
in closed form only in those cases when the solution can be found
in some other way. But the computations required to find the
equations of motion are greatly simplified with the help of the
Hamilton-Jacobi equation as compared with other methods of solving
problems of mechanics. Besides, from the specific form of Eq. (10.20)
it is easier to conclude whether it can have a solution in closed form.


Action as a Transformation Function. One possible transformation 
function is the action S of a system. To demonstrate this, consider 
Eq. (10.5), which expresses the total variation of the action. It was 
pointed out in Section 2 that the difference between the variation 
and the differential of a coordinate is that the former is arbitrary, 
while the differential is taken along the actual path. Let us now 
refer Eq. (10.5) to just that case, that is, actual motion. Then, since
Hamilton’s equations (10.6a) and (10.6b) are always satisfied for
actual motion, the integrand vanishes. At the integration limits
the variations $\delta q_\alpha$ become the differentials of those quantities taken
along the paths, so that the action variation $\delta S$ can be replaced by
$dS$. We thus obtain

$$dS = p_\alpha\,dq_\alpha - p_\alpha^0\,dq_\alpha^0 \tag{10.22}$$

But since the motion of the system has now been defined, the
action $S$ can be treated as a function of the running coordinates $q_\alpha$
for the given initial values $q_\alpha^0$. This corresponds to the following
construction. At the initial time we have a continuous set of systems
of the same type differing only in the initial conditions of motion.
In other words, the systems have the same Hamiltonians and at the
initial time, as it were, fill a smooth surface so that each point of the
surface corresponds to a certain set of initial conditions of motion.

With time each system develops its running values of the
generalized coordinates in accordance with the dynamic laws and
initial conditions. For these coordinates we can again construct a
surface similar to the initial one, but on it the value of the action
will be different from what it was on the initial surface for the given
system that “started” at time $t = 0$.

However, if we begin to “eject” systems from the initial surface in
continuous succession, then at each instant there can be found a
surface on which the value of the action is the same as it was at time
$t = 0$ for systems “launched” at the beginning. Consequently, a
constant-action surface moves in a $q_\alpha$ generalized-coordinate space
without being fixed with respect to one and the same particles (we
shall demonstrate this later with the simple example of free particles).

Figure 13

Consider the propagation of the surface of equal action shown
schematically in Figure 13 in two-dimensional space. Since the
action is now calculated along a known path, it is a function of $q_\alpha$
and $t$ as well as of the point on the initial surface through which the
given path passed. If $S = S(q_\alpha, q_\alpha^0, t)$, then for the given time $t$
the total differential $dS$ must be

$$dS = \left(\frac{\partial S}{\partial q_\alpha}\right)_{q^0}dq_\alpha + \left(\frac{\partial S}{\partial q_\alpha^0}\right)_{q}dq_\alpha^0 \tag{10.23}$$

From a comparison of (10.22) with (10.23) we conclude that

$$p_\alpha = \left(\frac{\partial S}{\partial q_\alpha}\right)_{q^0} \tag{10.24}$$

$$p_\alpha^0 = -\left(\frac{\partial S}{\partial q_\alpha^0}\right)_{q} \tag{10.25}$$

where the subscript of the parentheses indicates that all the variables
from the given totality of $q_\alpha$ or $q_\alpha^0$ are assumed constant in the
differentiation.



Furthermore, we also calculate the partial derivative $\partial S/\partial t$.
The total derivative $dS/dt$ is equal to the Lagrangian (according
to the definition of action). The partial derivative differs from the
total derivative in the following way:

$$\frac{\partial S}{\partial t} = \frac{dS}{dt} - \frac{\partial S}{\partial q_\alpha}\frac{dq_\alpha}{dt}$$

Substituting $p_\alpha$ from (10.24), we obtain

$$\frac{\partial S}{\partial t} = L - p_\alpha\dot q_\alpha = -\mathcal{H} \tag{10.26}$$

But Eqs. (10.24)-(10.26) coincide with (10.15)-(10.17), if we put
the new Hamiltonian $\mathcal{H}'$ equal to zero in the latter. Hence, action is
a transformation function which satisfies the Hamilton-Jacobi 
equation. The integration constants in this case are the initial 
coordinates, while the other n constants are, according to (10.21) 
and (10.25), the initial momenta. 

Thus, the displacement of a system along a path of actual motion 
performs a continuously developing canonical transformation from 
the running variable coordinates and momenta to the constant 
initial coordinates and momenta. 

Let us now find the speed with which the surfaces of constant
action propagate through space. For clarity take a system with one
degree of freedom. We differentiate the action and require the
differential to vanish, since we want to find out how one and the same
value of $S$ propagates:

$$dS = \frac{\partial S}{\partial q}\,dq + \frac{\partial S}{\partial t}\,dt = 0$$

Instead of the partial derivative of the action with respect to time
we substitute the energy with a minus sign, and instead of $\partial S/\partial q$,
the momentum. Hence the required velocity, $dq/dt$, is

$$\frac{dq}{dt} = \frac{E}{p}$$

But from (10.6b) the velocity $v$ of the particles themselves is
equal not to the quotient $E/p$ but to the derivative $\partial E/\partial p$. For
example, in the case of free particles the surface of constant action
propagates with the speed $mv^2/(2mv) = v/2$, that is, half as fast as
the particles themselves.
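
The factor of one half can be checked on an explicit action. The lines below are a worked sketch under an assumed setup that is not spelled out in the text: free particles of mass $m$ all leave the point $q = 0$ at $t = 0$, and $S_*$ denotes an arbitrarily chosen constant value of the action.

```latex
\begin{aligned}
S(q,t) &= \int_0^t \frac{mv^2}{2}\,dt' = \frac{mq^2}{2t}, \qquad v = \frac{q}{t},\\
\frac{\partial S}{\partial q} &= \frac{mq}{t} = p, \qquad
\frac{\partial S}{\partial t} = -\frac{mq^2}{2t^2} = -E,\\
S = S_* = \text{const} &\;\Rightarrow\; q = \sqrt{\frac{2S_*t}{m}}, \qquad
\frac{dq}{dt} = \frac{q}{2t} = \frac{v}{2}.
\end{aligned}
```

The surface $S = S_*$ is thus overtaken by every particle that reaches it, in agreement with the remark that it is not fixed with respect to one and the same particles.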

Integration of the Hamilton-Jacobi Equation. As an example let 
us examine how the Hamilton-Jacobi equation is integrated in 
a specific case, Kepler’s problem. As was shown in Section 3 (see 
(3.11)), the Lagrangian in this problem has the form 

$$L = \frac{m}{2}\left(\dot r^2 + r^2\dot\vartheta^2 + r^2\sin^2\vartheta\,\dot\varphi^2\right) - U(r) \tag{10.27}$$

Let us now find the Hamiltonian. The momenta are expressed
in terms of the velocities as follows:

$$p_r = m\dot r, \qquad p_\vartheta = mr^2\dot\vartheta, \qquad p_\varphi = mr^2\sin^2\vartheta\,\dot\varphi$$

From this, expressing the generalized velocities in terms of the
respective momenta and replacing in the Hamiltonian the quantity
$-U$ involved in the Lagrangian by $U$, we obtain

$$\mathcal{H} = \frac{1}{2m}\left(p_r^2 + \frac{p_\vartheta^2}{r^2} + \frac{p_\varphi^2}{r^2\sin^2\vartheta}\right) + U(r) \tag{10.28}$$

To obtain the Hamilton-Jacobi equation from this we replace the
momenta by the derivatives of action with respect to the corresponding
coordinate, according to the general rule, and equate the
expression to $-\partial S/\partial t$. Instead of the transformation function $V$
we write everywhere $S$. Then

$$\frac{1}{2m}\left[\left(\frac{\partial S}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial S}{\partial\vartheta}\right)^2 + \frac{1}{r^2\sin^2\vartheta}\left(\frac{\partial S}{\partial\varphi}\right)^2\right] + U(r) = -\frac{\partial S}{\partial t} \tag{10.29}$$


First of all we must eliminate the variables not explicitly involved
in the equation, that is, time $t$ and azimuth $\varphi$. This is done as
follows. Instead of $S$ we introduce a new function $S_0$, the so-called
“shortened” action:

$$S = -Et + p_\varphi\varphi + S_0(r, \vartheta) \tag{10.30}$$

Substituting Eq. (10.30) into (10.29), we obtain the equation for $S_0$:

$$\frac{1}{2m}\left[\left(\frac{\partial S_0}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial S_0}{\partial\vartheta}\right)^2 + \frac{p_\varphi^2}{r^2\sin^2\vartheta}\right] + U(r) = E \tag{10.31}$$

To integrate Eq. (10.31) we seek a solution in the form

$$S_0 = R(r) + \Theta(\vartheta)$$


We substitute this solution into (10.31) and transpose the terms
involving the variable $\vartheta$ to the left-hand side of the equation, and
those involving energy to the right-hand side (though the latter
is not essential). As a result, on the left we have a function dependent
only on the polar angle, and on the right a function dependent only
on $r$. But $r$ and $\vartheta$ are independent variables.
We can, for example, vary $r$ without changing $\vartheta$. Then the right-hand
side of the equation could change while the left-hand side remained
unchanged, or vice versa. Such a situation may occur in one and
only one case: when both sides of the equation are constant and do
not depend on either $r$ or $\vartheta$. Denoting their value $M^2$, we obtain
two equations:

$$\left(\frac{d\Theta}{d\vartheta}\right)^2 + \frac{p_\varphi^2}{\sin^2\vartheta} = M^2, \qquad \left(\frac{dR}{dr}\right)^2 + \frac{M^2}{r^2} = 2m\left[E - U(r)\right]$$

Integrating both with the help of quadratures and combining
the result in Eq. (10.30), we obtain the total integral of the
Hamilton-Jacobi equation, which involves three arbitrary constants:

$$S = -Et + p_\varphi\varphi + \int d\vartheta\left(M^2 - \frac{p_\varphi^2}{\sin^2\vartheta}\right)^{1/2} + \int dr\left[2m(E - U) - \frac{M^2}{r^2}\right]^{1/2} \tag{10.32}$$

From this we can determine other constants by differentiating
with respect to the already known constants $E$, $p_\varphi$ and $M$. The
meaning of the first two new constants can be easily discerned from
the answer: they are the initial time, and the initial azimuth taken
with a minus sign. Indeed, after such substitutions the path
equation is reduced to the form

$$t - t_0 = \int\frac{m\,dr}{\left[2m(E - U) - M^2/r^2\right]^{1/2}} \tag{10.33}$$

$$\varphi - \varphi_0 = \int\frac{p_\varphi\,d\vartheta}{\sin^2\vartheta\left(M^2 - p_\varphi^2/\sin^2\vartheta\right)^{1/2}} \tag{10.34}$$

$$\frac{\partial S}{\partial M} = \int\frac{M\,d\vartheta}{\left(M^2 - p_\varphi^2/\sin^2\vartheta\right)^{1/2}} - \int\frac{M\,dr}{r^2\left[2m(E - U) - M^2/r^2\right]^{1/2}} = \text{constant} \tag{10.35}$$

The integrals contained in (10.33) and (10.35) can be found by
substituting the specific form of the dependence of the potential
energy on the distance to the centre, $U = U(r)$. In practice the
most important case is Kepler’s problem, when $U = -\alpha/r$. Then
(10.33) expresses the time dependence of the radius $r$, (10.34) connects
the azimuth and the polar angle, and (10.35) gives the dependence
of the radius on the polar angle.

This type of solution is conveniently employed in cases involving 
several bodies moving about an attractive centre in different planes. 
Then the obtained solution holds as long as the interactions between 
the bodies are not taken into account. When perturbations are taken 
into account, such a solution yields a zero approximation. 


Action Variables and Angular Variables. Let us now examine a
solution of the Hamilton-Jacobi equation as applied to the problem
of finite motion of a system. An example of such a solution is (10.32)
when the potential energy $U(r)$ refers to the forces of attraction,
and the total energy $E < 0$ at $U(\infty) = 0$.

In finite motion in a central field $r$, $\vartheta$ and $\varphi$ vary within finite
limits, which are set for $r$ and $\vartheta$ from the conditions that the
radicands in (10.32) must be positive; $\varphi$ varies from 0 to $2\pi$. It can be
seen from Eq. (10.24) that

$$p_r = \left[2m(E - U) - M^2/r^2\right]^{1/2}$$

$$p_\vartheta = \left(M^2 - p_\varphi^2/\sin^2\vartheta\right)^{1/2}$$

These momenta have real values for given $E$, $M$ and $p_\varphi$ when the
radicands are positive.

Suppose the motion is finite over all variables. Then each of the
integrals

$$J_i = \frac{1}{2\pi}\oint p_i\,dq_i \tag{10.36}$$

converges over the whole domain of variation of $p_i$ and $q_i$. The
integration domain commences from any value of $q_i$, proceeds to the
limit of the variation of $q_i$ on one side (where the integrand vanishes),
then back to the other limit, and to $q_i$ again. In other words, the
integral is the area enclosed by the curve $p_i = p_i(q_i)$ drawn in the
plane in which $p_i$, $q_i$ are laid off along the coordinate axes ($2\pi$ is
introduced into the definition so as to satisfy the equation

$$J_\varphi = \frac{1}{2\pi}\int_0^{2\pi}p_\varphi\,d\varphi = p_\varphi$$

that is, so as for $J_\varphi$ to simply coincide with $p_\varphi$; for uniformity the
same factor is involved in all the $J_k$).
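
The phase-plane area in (10.36) is easy to evaluate numerically. The sketch below is illustrative only: it assumes a unit-mass harmonic oscillator with $\mathcal{H} = p^2/2 + \omega^2q^2/2$, for which the enclosed area is that of an ellipse and $J = E/\omega$; the function name `action_variable` and the numerical values of `E` and `omega` are hypothetical choices.

```python
import math


def action_variable(p_of_q, q_min, q_max, n=200_000):
    """J = (1/2pi) * closed integral of p dq = (1/pi) * integral of p(q) over [q_min, q_max].

    p_of_q is the positive branch of the momentum, which vanishes at both
    limits; the midpoint rule is used so the end points are never evaluated.
    """
    h = (q_max - q_min) / n
    total = 0.0
    for k in range(n):
        total += p_of_q(q_min + (k + 0.5) * h)
    return total * h / math.pi


# Illustrative case (an assumption): H = p^2/2 + omega^2 q^2/2
E, omega = 1.3, 0.7
amplitude = math.sqrt(2.0 * E) / omega


def p_branch(q):
    """Positive root p(q) on the energy surface H = E."""
    return math.sqrt(max(0.0, 2.0 * E - (omega * q) ** 2))


J = action_variable(p_branch, -amplitude, amplitude)
```

The factor $1/\pi$ rather than $1/2\pi$ appears because the upper and lower branches of the closed curve contribute equally.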

Suppose the integrals $J_k$ have been calculated for all variables.
Obviously, they depend only on the first integrals of the motion,
which determine the values of all the generalized momenta $p_\alpha$.
For example, in the case of Kepler’s problem $J_r$, $J_\vartheta$, and $J_\varphi$ are
expressed in terms of $E$, $M$, $p_\varphi$. Let us now express the first
integrals in terms of the quantities $J_k$. We obtain, in particular, the
energy as a function of all the $J_k$'s:

$$E = E(J_1, J_2, \ldots, J_n) \tag{10.37}$$

We substitute the integrals of motion expressed in terms of $J_k$
into an action function analogous to (10.32), but not involving
time explicitly, that is, one into which the term $-Et$ has not been
introduced. Then, according to the general theory of canonical
transformations, we obtain a transformation function from the
variables $q_\alpha$ to other, also canonical, variables. Let $J_k$ be the
generalized momenta in terms of the new variables. The transformation
function now has the form

$$S = S_1(q_\alpha, J_k) \tag{10.38}$$

Since time is not involved explicitly in (10.38), the new
Hamiltonian is, in accordance with (10.17), equal to the old one:

$$\mathcal{H}' = \mathcal{H} = E(\ldots J_k \ldots) \tag{10.39}$$

But this new Hamiltonian depends only on the generalized
momenta $J_k$, so that by applying Eqs. (10.6a) and (10.6b) we obtain
Hamilton’s equations for the momenta $J_k$ and the corresponding
coordinates, which we denote $w_k$:

$$\dot J_k = -\frac{\partial E}{\partial w_k} = 0 \tag{10.40}$$

$$\dot w_k = \frac{\partial E}{\partial J_k} = \text{constant} \tag{10.41}$$

The integration of (10.40) once again confirms that the $J_k$ are
constants of motion. From (10.41) it follows that the variables $w_k$
vary linearly with time:

$$w_k = \frac{\partial E}{\partial J_k}\,t + w_k^0 \tag{10.42}$$

The quantities $w_k$ and $J_k$ are known as angular variables and
action variables. The designation for $w_k$ was chosen because these
variables increase like an angle in uniform rotation.

Let us now follow the change of one angular variable due to the
passing of the corresponding generalized coordinate over the whole
permitted range of values in both directions. This is not actual
motion, when all the generalized coordinates change; what we are
doing is to fix the values of all the coordinates but one, which we
vary for a given value of all the integrals of the motion.

Obviously, when a varying coordinate passes through all possible
values and returns, together with the corresponding velocity, to the
initial value, only the angular variable $w_k$ corresponding to it
receives an increment. The action increment in this imaginary cycle is
$2\pi J_k$, since over one cycle $\oint p_i\,dq_i$ is equal to $2\pi J_i$ (see (10.36)).
But according to (10.16), to such a variation of the action there must
be correlated a change in the angular variable $\Delta w_k$, connected with
it by the equation

$$\Delta w_k = \frac{\partial}{\partial J_k}\Delta S_k = \frac{\partial}{\partial J_k}\,2\pi J_k = 2\pi$$

We find the time interval $\tau_k$ necessary for this cycle. From (10.42)
we have

$$\Delta w_k = 2\pi = \frac{\partial E}{\partial J_k}\,\tau_k$$

To this period of change in the coordinate there corresponds a
frequency $\omega_k$ equal, according to the general definition, to $2\pi/\tau_k$,
that is

$$\omega_k = \frac{\partial E}{\partial J_k} \tag{10.43}$$

If only the coordinate $q_k$ were changing, it would, bearing in
mind the period of change $\tau_k$, be dependent on time according to
the law

$$q_k = \mathrm{Re}\left[\sum_n A_{kn}\exp(in\omega_k t)\right] \tag{10.44}$$

Actually, though, all the $q_k$'s depend on all the $w_l$'s so that instead
of the special law (10.44) we must write a more general one:

$$q_k = \mathrm{Re}\left[\sum_{n_1, n_2, \ldots}A_{n_1 n_2 \ldots}\exp\left(i\sum_l n_l\omega_l t\right)\right] \tag{10.45}$$

where the integers $n_l$ acquire any possible values.

Then, by “stopping” the time for all variables but one, we revert
to (10.44), that is, to a periodic dependence, while in reality we
have a dependence of more general form, Eq. (10.45).

If the frequencies $\omega_l$ are incommensurable, then all the linear
combinations $\sum_l n_l\omega_l$ with integral $n_l$ are different. Hence, for no
value of $t$ does (10.45) revert to the initial value it had at $t = 0$.
If we wait long enough, however, it can come infinitely close to the
initial value. The coordinate $q_k$ is not periodic but is, as they say,
an almost periodic, or quasi-periodic, function of time.

Sometimes, though, the frequencies are commensurable. In that
case after a time interval that is a multiple of all periods the system
reverts to its initial state, and its path closes. An example of a
closed path is the elliptical path in Kepler’s problem, in which all
three periods $\tau_r$, $\tau_\vartheta$, and $\tau_\varphi$ are the same. Another such case is a
two-dimensional harmonic oscillator with equal oscillation frequencies
in two directions.


Adiabatic Invariants. Suppose now that a system is subject to
some external action dependent on time (for example, the
Hamiltonian involves a parameter $\lambda$ which gradually changes with time),
but we accept that in any period $\tau_k$ the parameter $\lambda$ varies but slightly.

If the Hamiltonian of the system involves time explicitly, the
transformation function, $S_1$, to the variables $J_k$ must also be
dependent on time. Given these conditions, the old Hamiltonian is not
equal to the new and is, according to (10.17), connected with it by
the relationship

$$\mathcal{H}' = E(\ldots J_k \ldots, \lambda) + \left(\frac{\partial S_1}{\partial\lambda}\right)_J\dot\lambda \tag{10.46}$$

Here, at very slow variations of $\lambda$, the derivative $(\partial S_1/\partial\lambda)_J$ can
be referred to a constant $\lambda$, and (10.46) can be treated as a Taylor
expansion of the Hamiltonian involving only the first-order term
in $\dot\lambda$.

If the Hamiltonian varies with time, its derivatives

$$\dot J_k = -\frac{\partial\mathcal{H}'}{\partial w_k} = -\left(\frac{\partial^2 S_1}{\partial w_k\,\partial\lambda}\right)_J\dot\lambda \tag{10.47}$$

must also depend on time.

We shall show, however, that in a sense $J_k$ varies much slower
than $\lambda$. For this we introduce the concept of the mean of a certain
quantity $f(t)$ with respect to time:

$$\bar f = \frac{1}{t}\int_0^t f(t')\,dt' \tag{10.48}$$

Suppose that the time interval $t$ is very great in comparison with
all the periods $\tau_k$ of the system, but very small in comparison with
the time of appreciable variation of the parameter $\lambda$. If we write
this condition in the form of strong inequalities, $\Delta\lambda = \dot\lambda t \ll \lambda$
and $\omega_k t \gg 1$, we can, in averaging Eq. (10.47) over time, take $\dot\lambda$
outside the integral. We obtain the following equation:

$$\frac{J_k(t) - J_k(0)}{t} = -\dot\lambda\,\overline{\left(\frac{\partial^2 S_1}{\partial w_k\,\partial\lambda}\right)_J}$$

As can be seen from Eq. (10.46), the derivative $(\partial S_1/\partial\lambda)_J$ is taken
at constant $J_k$. Therefore, if the time varies by any integral period
$\tau_k$ (when to $S$ is added $2\pi J_k$), the derivative receives no such
increment, remaining a quasiperiodic function. But over a sufficiently
long time interval the mean value of the function $\exp\left(i\sum_k n_k\omega_k t\right)$
tends to zero, because

$$\lim_{t\to\infty}\frac{1}{t}\int_0^t\exp\left(i\sum_k n_k\omega_k t'\right)dt' = \lim_{t\to\infty}\frac{\exp\left(i\sum_k n_k\omega_k t\right) - 1}{it\sum_k n_k\omega_k} = 0$$

Consequently, the mean variation of the quantity $[J_k(t) - J_k(0)]/t$
is substantially less than the variation of $\lambda$ in the same time. In the
limit, assuming $\dot\lambda$ to be an infinitesimal of the first order, we find
that the variation of $J_k$ is an infinitesimal of higher order. In other
words, $J_k$ should be considered as a constant quantity.

This is an important property of $J_k$ that distinguishes it from
energy, which varies accordingly as $\lambda$, or, in other words, is not
conserved. The arbitrarily slow variation of a quantity in a system
is known as adiabatic variation. Since $J_k$ is conserved in such
variation, it is called an adiabatic invariant.

Let us calculate it for a linear harmonic oscillator. According
to the general definition (7.31), the energy of a separate oscillator is

$$E = \frac{p^2}{2} + \frac{\omega^2q^2}{2} \tag{10.49}$$

whence the adiabatic invariant is

$$J = \frac{1}{2\pi}\oint p\,dq = \frac{1}{\pi}\int_{-\sqrt{2E}/\omega}^{\sqrt{2E}/\omega}\left(2E - \omega^2q^2\right)^{1/2}dq = \frac{E}{\omega} \tag{10.50}$$

Suppose that as a consequence of a variation in the elastic
constant of an oscillator its frequency is slowly varying. Then we find
from (10.50) that the energy must vary in proportion to the frequency.
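
The proportionality of energy to frequency can be observed in a direct simulation. What follows is a rough sketch under assumptions of my own choosing rather than anything prescribed by the text: a linear ramp `omega(t)` that doubles the frequency over a time long compared with the period, a leapfrog integrator, and the helper names `omega`, `energy` and `evolve`.

```python
def omega(t, omega0=1.0, rate=1.0e-3):
    """Slowly rising frequency omega(t) = omega0 * (1 + rate * t) (illustrative)."""
    return omega0 * (1.0 + rate * t)


def energy(q, p, w):
    """E = p^2/2 + w^2 q^2/2 for a unit-mass oscillator of frequency w."""
    return 0.5 * p * p + 0.5 * (w * q) ** 2


def evolve(q, p, t_end, dt=0.01):
    """Kick-drift-kick (leapfrog) integration of q'' = -omega(t)^2 q."""
    n_steps = int(round(t_end / dt))
    t = 0.0
    for _ in range(n_steps):
        p -= 0.5 * dt * omega(t) ** 2 * q   # half kick at the old time
        q += dt * p                          # full drift
        t += dt
        p -= 0.5 * dt * omega(t) ** 2 * q   # half kick at the new time
    return q, p, t


q0, p0 = 1.0, 0.0
E_start = energy(q0, p0, omega(0.0))
J_start = E_start / omega(0.0)

q1, p1, t1 = evolve(q0, p0, t_end=1000.0)   # omega doubles over this interval
E_end = energy(q1, p1, omega(t1))
J_end = E_end / omega(t1)
```

With these settings the ratio `E_end / E_start` should come out close to 2 while `J_end / J_start` stays close to 1; making `rate` larger is expected to degrade the invariance, since the variation is then no longer slow compared with the period.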

If, for example, we gradually change the length of a pendulum,
we can readily conclude with the help of (7.3) that the energy of
the oscillations is inversely proportional to the square root of the
length. Expressing energy in terms of the angular oscillation
amplitude, $\varphi_0$, according to the formula $E = mgl\varphi_0^2/2$, we conclude that
$\varphi_0 \propto l^{-3/4}$.

The conclusions concerning a pendulum can, of course, be obtained
in elementary fashion. For this we must calculate the work against
the tensile force of the thread done in shortening it over one
oscillation period. If the thread was shortened by $\Delta l$, only one half the
work done (the difference between the work done in raising the
highest and lowest points of the pendulum) transforms into the
oscillation energy. The tensile force in the thread due to the
oscillations of the pendulum is equal to $mv^2/l$, where $v = l\omega\varphi_0\sin\omega t$.
Taking into account that the mean square of the sine in one
oscillation is equal to 1/2, we find that the work on extending the point of
suspension that transforms into the oscillation energy is
$-(1/2)\,mg\varphi_0^2\,\Delta l/2$. Since the oscillation energy itself is equal to
$mgl\varphi_0^2/2$, we obtain the ratio

$$\frac{\Delta E}{E} = -\frac{\Delta l}{2l}$$

Passing from differences to differentials and integrating, we obtain

$$El^{1/2} = \text{constant}$$
EXERCISES


1. Find the transformation function from variables $P$, $Q$ to variables
$J$, $w$ for a linear harmonic oscillator.

Solution. The action integral for a linear harmonic oscillator is

$$S_0' = \int P\,dQ = \int\left(2E - \omega^2Q^2\right)^{1/2}dQ$$

According to the general rule, into this we must substitute the energy
integral expressed in terms of the action variable $J$ according to (10.50),
or $E = J\omega$. Then the action

$$S_0' = J\arcsin\left[Q\left(\frac{\omega}{2J}\right)^{1/2}\right] + \frac{\omega Q^2}{2}\left(\frac{2J}{\omega Q^2} - 1\right)^{1/2}$$

In this form it is the transformation function involving the old
coordinate $Q$ and the new momentum $J$. From (10.13) and (10.16), the
transformation function expressed in terms of the old coordinate $Q$ and the new
coordinate $w$ is connected with $S_0'$ by the relationship

$$S_0 = S_0' - wJ = S_0' - J\frac{\partial S_0'}{\partial J} = \frac{\omega Q^2}{2}\left(\frac{2J}{\omega Q^2} - 1\right)^{1/2}$$

This yields the required transformation function expressed in terms of
the old coordinate and the new momentum. Into this we must substitute
the new momentum, connected with the new coordinate by the formula

$$J = \frac{\omega Q^2}{2\sin^2 w}$$

We finally obtain the required function in the form

$$S_0 = \frac{\omega Q^2}{2}\cot w$$

(Poincaré’s transformation).
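
As a numerical cross-check of this result, one can differentiate $S_0 = (\omega Q^2/2)\cot w$ by finite differences and compare with the formulas (10.10) and (10.11), that is, old momentum $P = \partial S_0/\partial Q$ and new momentum $J = -\partial S_0/\partial w$. The snippet is only a sketch; the sample values of `omega`, `J` and `w` and the helper names `s0` and `central_diff` are assumptions made for illustration.

```python
import math


def s0(Q, w, omega):
    """Transformation function S_0(Q, w) = (omega Q^2 / 2) cot w."""
    return 0.5 * omega * Q * Q / math.tan(w)


def central_diff(f, x, h=1.0e-6):
    """Symmetric finite-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


omega, J, w = 1.5, 0.8, 0.7   # illustrative values (assumptions)

# Old coordinate from J = omega Q^2 / (2 sin^2 w), taking Q > 0.
Q = math.sqrt(2.0 * J / omega) * math.sin(w)

# Old momentum expected from the energy integral with E = J * omega.
P_expected = math.sqrt(2.0 * J * omega - (omega * Q) ** 2)

P_numeric = central_diff(lambda x: s0(x, w, omega), Q)    # dS_0/dQ
J_numeric = -central_diff(lambda x: s0(Q, x, omega), w)   # -dS_0/dw
```

The chosen $w$ lies in the first quadrant, where $\cos w > 0$ and the positive root for $P$ applies.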

2. Construct the equation of constant-action surfaces for a system of
material particles emerging from one point in space with the same absolute
velocity $v_0$ in a field of gravity.

Solution. Taking the initial point as the origin of the coordinate system,
we have

$$v_x = v_{0x}, \qquad p_x = mv_{0x}, \qquad x = v_{0x}t$$

$$v_y = v_{0y}, \qquad p_y = mv_{0y}, \qquad y = v_{0y}t$$

$$v_z = v_{0z} - gt, \qquad p_z = mv_{0z} - mgt, \qquad z = v_{0z}t - gt^2/2$$

Eliminating the initial velocity conditions defining the path of each separate
particle, we obtain for the surface of constant action $S$ as a whole

$$p_x = m\frac{x}{t} = \frac{\partial S}{\partial x}, \qquad p_y = m\frac{y}{t} = \frac{\partial S}{\partial y}, \qquad p_z = m\frac{z}{t} - \frac{mgt}{2} = \frac{\partial S}{\partial z}$$

$$S = \frac{m}{2}\left(\frac{x^2}{t} + \frac{y^2}{t} + \frac{z^2}{t} - gzt - \frac{g^2t^3}{12}\right)$$

From the expression for $S$ we find the energy:

$$E = -\frac{\partial S}{\partial t} = \frac{m}{2}\left(\frac{x^2 + y^2 + z^2}{t^2} + gz + \frac{g^2t^2}{4}\right) = \frac{mv_0^2}{2}$$
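
The action of this exercise can be tested against the Hamilton-Jacobi equation numerically. The sketch below assumes the form $S = (m/2)\left[(x^2 + y^2 + z^2)/t - gzt - g^2t^3/12\right]$ and the Hamiltonian $p^2/2m + mgz$; the values of `m`, `g`, the launch velocity `v0` and the helper `partial` are illustrative assumptions.

```python
m, g = 2.0, 9.8   # illustrative mass and gravitational acceleration (assumptions)


def action(x, y, z, t):
    """S(x, y, z, t) for particles launched from the origin at t = 0."""
    return 0.5 * m * ((x * x + y * y + z * z) / t - g * z * t - g * g * t ** 3 / 12.0)


def partial(f, args, i, h=1.0e-5):
    """Central-difference partial derivative of f with respect to args[i]."""
    up, down = list(args), list(args)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2.0 * h)


# A point actually reached at time t by a particle with launch velocity v0.
v0 = (1.0, -0.5, 3.0)
t = 0.4
point = (v0[0] * t, v0[1] * t, v0[2] * t - 0.5 * g * t * t, t)

px, py, pz = (partial(action, point, i) for i in range(3))
dS_dt = partial(action, point, 3)

H = (px * px + py * py + pz * pz) / (2.0 * m) + m * g * point[2]
residual = dS_dt + H                       # should vanish by Hamilton-Jacobi
E0 = 0.5 * m * sum(v * v for v in v0)      # energy fixed by the launch speed
```

Besides the vanishing residual, `pz` should reproduce the momentum $m(v_{0z} - gt)$ of the particle that passes through the chosen point.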

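3. Determine the action variables for Kepler’s problem.

Solution. From (10.32) we determine the expressions for the action
variables $J_\vartheta$ and $J_r$:

$$2\pi J_\vartheta = 2\int_{\vartheta_0}^{\pi - \vartheta_0}\left(M^2 - \frac{p_\varphi^2}{\sin^2\vartheta}\right)^{1/2}d\vartheta$$

$$2\pi J_r = 2\int_{r_1}^{r_2}\left[2m(E - U) - \frac{M^2}{r^2}\right]^{1/2}dr$$

where the radicands vanish at the integration limits. Substituting
$U = -\alpha/r$, we obtain after some simple computations

$$J_\vartheta = M - p_\varphi, \qquad J_r = -M + \alpha\left(\frac{m}{2|E|}\right)^{1/2}$$

Knowing, in addition, that $J_\varphi = p_\varphi$, we express the energy in terms
of the action variables:

$$E = -\frac{\alpha^2m}{2\left(J_r + J_\vartheta + J_\varphi\right)^2}$$

Thus, all three rotation frequencies coincide:

$$\omega_r = \omega_\vartheta = \omega_\varphi = \frac{\partial E}{\partial J} = \frac{\alpha^2m}{\left(J_r + J_\vartheta + J_\varphi\right)^3}$$

which corresponds to a closed path.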
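
The radial action variable can be confirmed by direct quadrature. This is a sketch under assumed units and values ($m = \alpha = 1$, $E = -0.5$, $M = 0.6$, none of them from the text); the function name `radial_action` and the trigonometric substitution used to tame the turning points are choices made here for illustration.

```python
import math


def radial_action(E, M, m=1.0, alpha=1.0, n=2000):
    """J_r = (1/pi) * integral from r1 to r2 of sqrt(2m(E + alpha/r) - M^2/r^2) dr, E < 0."""
    # Turning points: roots of 2mE r^2 + 2m alpha r - M^2 = 0.
    disc = math.sqrt((m * alpha) ** 2 + 2.0 * m * E * M * M)
    r1 = (-m * alpha + disc) / (2.0 * m * E)
    r2 = (-m * alpha - disc) / (2.0 * m * E)
    centre, half_width = 0.5 * (r1 + r2), 0.5 * (r2 - r1)
    # The substitution r = centre - half_width * cos(theta) removes the
    # square-root behaviour of the integrand at the turning points.
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * math.pi / n
        r = centre - half_width * math.cos(theta)
        radicand = 2.0 * m * (E + alpha / r) - (M / r) ** 2
        total += math.sqrt(max(0.0, radicand)) * half_width * math.sin(theta)
    return total / n   # step pi/n multiplied by the prefactor 1/pi


E, M = -0.5, 0.6                      # illustrative bound orbit (assumptions)
J_r = radial_action(E, M)
J_r_formula = -M + 1.0 * math.sqrt(1.0 / (2.0 * abs(E)))

# With J_theta + J_phi = M, the energy should be recovered from the actions.
E_from_actions = -1.0 / (2.0 * (J_r + M) ** 2)
```

Because $E$ depends on the actions only through their sum, the derivative with respect to each of them is the same, which is the numerical counterpart of the coinciding frequencies.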

4. The energy lost by the sun through radiation is generated at the
expense of its mass. Determine how this affects the orbits of the planets.

Solution. We proceed from the equation of an orbit in its plane (5.12c).
The constant $\alpha$ in the expression for the potential energy is equal to $Gmm_0$,
where $G$ is the gravitational constant. In the present case, the variable
parameter $\lambda$ is the mass $m_0$ of the sun. We express the eccentricity of the orbit $e$,
the major semiaxis $b$, and the rotation period $\tau$ in terms of the action
variables, which do not change in a slow variation of the parameter $m_0$ (adiabatic
invariants):

$$e = \left[1 - \frac{\left(J_\vartheta + J_\varphi\right)^2}{\left(J_r + J_\vartheta + J_\varphi\right)^2}\right]^{1/2}, \qquad b = \frac{\left(J_\vartheta + J_\varphi\right)^2}{Gm^2m_0\left(1 - e^2\right)}$$

$$\tau = \frac{2\pi}{\omega_r} = \frac{2\pi\left(J_r + J_\vartheta + J_\varphi\right)^3}{G^2m^3m_0^2}$$


PART II 


ELECTRODYNAMICS 


11 


VECTOR ANALYSIS 

The equations of electrodynamics are considerably simplified if they
are written in vector, or in some cases tensor, form. Vector notation
eliminates the arbitrariness associated with the choice of coordinate
system and reveals the physical content of equations more vividly.
In this sense tensor notation, which is employed in relativity theory,
provides even greater advantages, making it possible to refer
equations not only to arbitrary coordinates but to any inertial frame of
reference.

In the foregoing discourse we assumed that the reader was familiar
with elements of vector algebra, though we introduced certain
clarifications. Furthermore, in Section 9 we offered a definition of
tensors in which vector quantities represent a special case.

Now, in electrodynamics, vector differential operations are used.
In the first part there was only one such operation: differentiation
of a scalar with respect to a vector. This was used to calculate
momentum from the Lagrangian or force from the potential energy
formula. But it may be necessary to differentiate a vector with
respect to a vector, and it is most useful to retain the vector notation,
which offers a better idea of the geometric meaning of the operations.

This section is devoted to vector differential operations, as well
as to tensor algebra, inasmuch as it is required in studying the theory
of relativity.

Vector of an Area. We shall start with defining the vector of an
area element $d\mathbf{S}$. This is a vector normal to the area, numerically
equal to its area, and related to the direction in which the
boundary of the area is traced as the displacement of a corkscrew is
related to the direction of rotation of its handle (Figure 14). This
definition covers, in particular, the vector, or cross, product of two vectors,
which is numerically equal to the area of a parallelogram constructed
with the two vectors as its sides, and, as a vector, is perpendicular to
its plane. In Section 4 we offered detailed proof that the result
of a vector product is a vector, at least with respect to rotations
of the coordinate system. This proof is also valid with respect to the
vector of any area element.

Figure 14    Figure 15

We shall make use of a right-handed coordinate system $x$, $y$, $z$
in which the rotation, looking from the $z$ axis, is counterclockwise
from $x$ to $y$ (Figure 15). In this coordinate system the area vector
can be resolved into components as follows:

$$dS_x = dy\,dz, \qquad dS_y = dz\,dx, \qquad dS_z = dx\,dy$$

Flux of a Vector. Suppose now that a liquid of unit density
(“water”) is flowing across the area, the flow velocity being denoted
by $\mathbf{v}$. We denote the angle between $d\mathbf{S}$ and $\mathbf{v}$ as $\alpha$. In Figure 16 are
shown the streamlines passing through $dS$. They are parallel to the
velocity $\mathbf{v}$. Let us compute the per-second rate of flow of the liquid
across the area $dS$. It is equal simply to $v\,dS'$, where $dS'$ is the
area perpendicular to the streamlines, as shown in Figure 16. Indeed,
the amount of liquid passing through $dS'$ in unit time is equal to
the volume of a cylinder of base $dS'$ and altitude $v$. But
$dS' = dS\cos\alpha$, whence the required rate of flow of the liquid is

$$dJ = v\,dS' = v\,dS\cos\alpha = \mathbf{v}\cdot d\mathbf{S} \tag{11.1}$$

By analogy, the scalar product of any vector $\mathbf{A}$ (taken at the
point of the infinitesimal area) and $d\mathbf{S}$ is the flux of vector $\mathbf{A}$
across the area $dS$. Similar to the way that the flow of liquid across
a finite area $S$ is equal to the integral of $dJ$ with respect to the area:

$$J = \int\mathbf{v}\cdot d\mathbf{S} \tag{11.2}$$

the integral

$$J = \int\mathbf{A}\cdot d\mathbf{S} \tag{11.3}$$

is called the flux of vector $\mathbf{A}$ across the area.

The area vector is introduced so that we can make use of the
convenient coordinate-free notation of (11.3). The integrals appearing
in (11.3) are double. In terms of the projections (11.3) can be written
thus:

$$J = \int\mathbf{A}\cdot d\mathbf{S} = \iint A_x\,dy\,dz + \iint A_y\,dz\,dx + \iint A_z\,dx\,dy$$

where the limits of the double integrals are determined from the
corresponding projections of the area boundary on the coordinate
planes.

The Gauss Theorem. Let us now calculate the flux of a vector across
a closed surface. For this we shall consider, first of all, the
infinitesimal closed surface of a parallelepiped (Figure 17). We shall make
the convention that the normal to a closed surface will always be
taken outwards from the volume.

Let us calculate the flux of vector $\mathbf{A}$ across the area $ABCD$ (the
direction of traverse being in agreement with the direction of the
normal). Since the flux is equal to the scalar product of $\mathbf{A}$ by the
vector of area $ABCD$ in the negative $x$-direction (and hence equal
to $-dy\,dz$), we obtain for this infinitely small area

$$dJ_{ABCD} = -A_x(x)\,dy\,dz$$

We get a similar expression for the area A'B'C'D', only in this
case the projection $dS_x$ is equal to $dy\,dz$, and $A_x$ is taken at
the point $x + dx$ instead of $x$. Therefore

$$ dJ_{A'B'C'D'} = A_x(x + dx)\,dy\,dz $$

Thus, the resultant flux across both areas perpendicular to the x
axis is

$$ dJ_{A'B'C'D'} + dJ_{ABCD} = [A_x(x + dx) - A_x(x)]\,dy\,dz = \frac{\partial A_x}{\partial x}\,dx\,dy\,dz \tag{11.4} $$

We have utilized the fact that dx is an infinitely small quantity,
and we have expanded $A_x(x + dx)$ in a series:
$A_x(x + dx) = A_x(x) + (\partial A_x/\partial x)\,dx$. The resultant
fluxes across the boundaries perpendicular to the y and z axes are
formed similarly. The net flux across the whole parallelepiped is

$$ dJ = \left( \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} \right) dx\,dy\,dz \tag{11.5} $$

A finite closed volume can be divided into small parallelepipeds,
and the relationship (11.5) applied to each one of them separately.
If we sum all the fluxes, the adjacent boundaries do not contribute,
since the flux emerging from one parallelepiped enters the
neighbouring one. Only the fluxes through the outer surface of the
selected volume remain, since they are not cancelled by others. But
the right-hand sides of (11.5) will be additive for all the volume
elements $dV = dx\,dy\,dz$, yielding the very important integral theorem:

$$ \oint \mathbf{A}\cdot d\mathbf{S} = \int \left( \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} \right) dV \tag{11.6} $$

It is called the Gauss theorem.
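As an illustration, the Gauss theorem can be checked numerically on a finite volume. The following Python sketch is not part of the derivation: the test field A, the unit cube and the grid size N are arbitrary assumptions chosen so that the midpoint rule integrates both sides essentially exactly.

```python
# Numerical check of the Gauss theorem (11.6) on the cube 0 <= x, y, z <= 1.
# The field A below is an illustrative assumption, not taken from the text.

def A(x, y, z):
    return (x * x * y, y * z, x + z * z)

def div_A(x, y, z):
    # dAx/dx + dAy/dy + dAz/dz, differentiated by hand for the field above
    return 2 * x * y + z + 2 * z

N = 40                                   # midpoint-rule cells per edge
h = 1.0 / N
mid = [(i + 0.5) * h for i in range(N)]

# Outward flux: on opposite faces the outward normals are +n and -n,
# so each pair contributes A_n(far face) - A_n(near face).
flux = 0.0
for u in mid:
    for v in mid:
        flux += (A(1, u, v)[0] - A(0, u, v)[0]) * h * h   # faces x = 1, x = 0
        flux += (A(u, 1, v)[1] - A(u, 0, v)[1]) * h * h   # faces y = 1, y = 0
        flux += (A(u, v, 1)[2] - A(u, v, 0)[2]) * h * h   # faces z = 1, z = 0

volume_integral = sum(div_A(x, y, z)
                      for x in mid for y in mid for z in mid) * h ** 3

print(flux, volume_integral)
```

Both numbers agree to rounding error, as the cancellation of fluxes across interior boundaries requires.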


The Divergence of a Vector. The expression appearing in the 
right-hand side under the integral sign can be written in a much 
shorter form. We first of all note that it is a scalar expression, since 
there is a scalar in the left-hand side of (11.6) (the “mass of water”), 
and dV is also a scalar. This expression is called the divergence of 
vector A and is written thus: 


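$$ \operatorname{div} \mathbf{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} \tag{11.7} $$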


Divergence can be defined independently of any coordinate
system, if (11.5) is used. Indeed, from (11.5) the definition of
divergence follows as

$$ \operatorname{div} \mathbf{A} = \lim_{V \to 0} \frac{\oint \mathbf{A}\cdot d\mathbf{S}}{V} \tag{11.8} $$


The divergence of a vector at a given point is equal to the limit 
of the ratio of the vector flux across the surface surrounding the 
point to the volume enveloped by the surface, when the surface is 
contracted to a point. 

Let us suppose that A denotes the velocity field of some fluid. 
Then, from definition (11.8), it can be seen that the divergence of A 
is a measure of the density of the sources of the fluid, or the number 
of sources per unit volume from which a unit mass of the fluid flows 
in unit time. Obviously, the more sources there are per unit volume, 
the more fluid will flow out of it. If div A is negative, we can speak 
of the density of sinks. But it is more convenient to define the source 
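density with the corresponding sign. We note that from (11.7) it
follows that

$$ \operatorname{div} \mathbf{r} = \frac{\partial x}{\partial x} + \frac{\partial y}{\partial y} + \frac{\partial z}{\partial z} = 1 + 1 + 1 = 3 \tag{11.9} $$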


since r has components x, y, z. 

It is not hard to derive the same result from the definition of
divergence (11.8) not involving the coordinates. First, from the
origin of the coordinate system we construct a cone containing an
infinitesimal solid angle $d\Omega$ at the vertex (Sec. 6). Since the
radius vector coincides with the generatrix and is consequently
perpendicular to the normal to the element of the side surface of the
cone, there is no vector flux across the side surface: the vector slips
along without penetrating it. The radius vector flux only crosses the
base of the cone.

Since the cone angle is infinitesimal, the area of the base is equal
to $r^2\,d\Omega$. The radius vector is perpendicular to the base at all
points, hence the flux of the radius vector across the base is
$r \cdot r^2\,d\Omega = r^3\,d\Omega$. The volume of a cone is equal to
the product of the base area and one-third the altitude, that is, again
the radius: $V = \tfrac{1}{3} r^3\,d\Omega$. Substituting the flux of the
radius vector and the volume of the cone into (11.8), we obtain
div r = 3. Actually in this proof there was no need to assume the
cone infinitely small. We did this only to make use of the general
definition of the divergence of a vector.
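The remark that the volume need not be small can be illustrated with a cube instead of a cone. The sketch below is an assumption-laden example (the helper name flux_through_cube, the cube centre and the edge lengths are all hypothetical choices): it evaluates the ratio of flux to volume in (11.8) for the radius vector.

```python
# Ratio flux / volume from definition (11.8) for the radius vector r,
# using cubes of several sizes. All names and numbers are illustrative.

def flux_through_cube(field, center, a, n=20):
    """Outward flux of `field` through a cube of edge a centred at `center`,
    computed with the midpoint rule on an n x n grid per face."""
    cx, cy, cz = center
    h = a / n
    offs = [-a / 2 + (i + 0.5) * h for i in range(n)]
    total = 0.0
    for u in offs:
        for v in offs:
            total += (field(cx + a / 2, cy + u, cz + v)[0]
                      - field(cx - a / 2, cy + u, cz + v)[0]) * h * h
            total += (field(cx + u, cy + a / 2, cz + v)[1]
                      - field(cx + u, cy - a / 2, cz + v)[1]) * h * h
            total += (field(cx + u, cy + v, cz + a / 2)[2]
                      - field(cx + u, cy + v, cz - a / 2)[2]) * h * h
    return total

def radius_vector(x, y, z):
    return (x, y, z)

ratios = [flux_through_cube(radius_vector, (0.3, -1.2, 2.0), a) / a ** 3
          for a in (1.0, 0.1, 0.01)]
print(ratios)
```

The ratio is the same for every cube size, in agreement with div r = 3 holding at every point.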

Circulatory Integrals and Stokes’ Theorem. Let us consider the
vector integral over a closed contour:

$$ C = \oint \mathbf{A}\cdot d\mathbf{l} = \oint (A_x\,dx + A_y\,dy + A_z\,dz) \tag{11.10} $$

This single integral is called the circulation of the vector over the
closed contour. For example, if A is the force acting on any particle,
then $\mathbf{A}\cdot d\mathbf{l} = A\,dl \cos\alpha$ is the work done by
the force on the contour element dl, and C is the work done in covering
the whole contour.

[Figure 18]

Now let us prove that the circulation of a vector A around a
contour can be replaced by the integral over the surface stretched on
the contour. Consider the projection of an infinitely small rectangular
contour on the y,z-plane. Let this projection also have the form
of the rectangle in Figure 18. We shall calculate the circulation
of A around this rectangle. The side AB contributes a component
$A_y(z)\,dy$, and side CD a component $-A_y(z + dz)\,dy$, where the
minus sign must be written because the direction of the vector CD
is opposite to that of the vector AB. We obtain, for the sum due
to the sides AB and CD,

$$ -A_y(z + dz)\,dy + A_y(z)\,dy = -\frac{\partial A_y}{\partial z}\,dy\,dz $$

(we have expanded $A_y(z + dz)$ in a series in dz). For the sides BC
and DA,

$$ A_z(y + dy)\,dz - A_z(y)\,dz = \frac{\partial A_z}{\partial y}\,dy\,dz $$

The resultant value for circulation in the y,z-plane is

$$ dC_{yz} = \left( \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z} \right) dy\,dz = B_x\,dS_x \tag{11.11} $$

The notation $B_x$ is clear from the equation.


If the contour is arbitrarily oriented in space, all its infinitesimal 
segments must, in accordance with (11.10), be projected on the 
three coordinate axes. Then the whole contour has projections on 
the three coordinate planes, and the circulation resolves into the 
sum of three expressions of the form (11.11). 

The circulation taken along the small contour is 


$$ dC = B_x\,dS_x + B_y\,dS_y + B_z\,dS_z = \mathbf{B}\cdot d\mathbf{S} \tag{11.12} $$

where $B_x$, $B_y$, and $B_z$ are the abbreviated notations of the
following differences between partial derivatives:

$$ B_x = \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z} \tag{11.13} $$

$$ B_y = \frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x} \tag{11.14} $$

$$ B_z = \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y} \tag{11.15} $$

Circulation is, by definition (11.10), a scalar quantity. Consequently,
the quantity in the right-hand side of (11.12) is also a scalar.
But since dS is a vector, B is also a vector whose components are
defined by Eqs. (11.13)-(11.15).

Vector B has a special name, the rotation , or curl, of vector A, 
and is denoted thus: 

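$$ \mathbf{B} = \operatorname{curl} \mathbf{A} $$

Curl A is resolved along the coordinate axes with the help of three
unit vectors:

$$ \mathbf{B} = \operatorname{curl} \mathbf{A} = \mathbf{n}^{(x)} \left( \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z} \right) + \mathbf{n}^{(y)} \left( \frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x} \right) + \mathbf{n}^{(z)} \left( \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y} \right) \tag{11.16} $$

Comparing (11.16) with (11.12), we see that Eq. (11.12) contains
the component of curl A normal to the area:

$$ \oint \mathbf{A}\cdot d\mathbf{l} = \operatorname{curl}_n \mathbf{A}\; dS \tag{11.17} $$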


where the subscript n of curl A indicates that we should take not 
the whole vector curl A but only its projection on the normal to 
the area, dS. Equation (11.17) enables us to determine curl A in 
a coordinate-free manner, similar to the way we defined div A 
in (11.8), namely 


$$ \operatorname{curl}_n \mathbf{A} = \lim_{S \to 0} \frac{\oint \mathbf{A}\cdot d\mathbf{l}}{S} \tag{11.18} $$

so that the component of curl A normal to any area at a given point
in space is the limit of the ratio of the circulation along the boundary
of the area to the area itself, when the boundary contracts into a point.

For the integral $\oint \mathbf{A}\cdot d\mathbf{l}$ to be nonzero, we
must have closed vector lines that to some extent follow the
integration line, similar to the closed lines of flow in a liquid in
vortex motion. Hence the term curl, or rotation.

If the circulation is calculated for a finite contour, then the
surface stretched on the contour can be broken up into infinitely small
cells to form a grid. For the sides of adjacent cells, the circulations
mutually cancel since each side is traversed twice in opposite
directions; only the circulation along the external contour itself
remains. The integral of the right-hand side of Eq. (11.17) gives the
flux of curl A across the surface stretched on the contour. Thus, we
obtain the desired integral theorem

$$ \oint \mathbf{A}\cdot d\mathbf{l} = \int \operatorname{curl} \mathbf{A}\cdot d\mathbf{S} \tag{11.19} $$

which is called Stokes' theorem.
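Stokes' theorem can be checked numerically in the same spirit. The sketch below is only an illustration under stated assumptions: the contour is the unit square in the x,y-plane traversed counterclockwise (so the normal points along z), and the test field A is an arbitrary choice.

```python
# Numerical check of Stokes' theorem (11.19) on the unit square
# 0 <= x, y <= 1 in the plane z = 0. The field A is an assumed example.

def A(x, y, z):
    return (-y, x * x, 0.0)

def curl_z(x, y, z):
    # (curl A)_z = dAy/dx - dAx/dy, by hand for the field above, cf. (11.15)
    return 2 * x + 1

N = 200
h = 1.0 / N
mid = [(i + 0.5) * h for i in range(N)]

circulation = 0.0
for t in mid:
    circulation += A(t, 0, 0)[0] * h    # bottom edge, x increasing
    circulation += A(1, t, 0)[1] * h    # right edge, y increasing
    circulation -= A(t, 1, 0)[0] * h    # top edge, x decreasing
    circulation -= A(0, t, 0)[1] * h    # left edge, y decreasing

surface_flux = sum(curl_z(x, y, 0) for x in mid for y in mid) * h * h

print(circulation, surface_flux)
```

The two sides coincide because the circulations along the interior cell sides cancel in pairs, exactly as in the argument above.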


Differentiation with Respect to the Radius Vector. The divergence 
and curl of a vector are its derivatives with respect to the vector 
argument. They can be reduced to a unified notation in the following 
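way. We introduce the vector symbol $\nabla$ (del, or nabla¹) with
components

$$ \nabla_x = \frac{\partial}{\partial x}, \qquad \nabla_y = \frac{\partial}{\partial y}, \qquad \nabla_z = \frac{\partial}{\partial z} \tag{11.20} $$

Then premultiplication by del denotes differentiation with respect
to the radius vector. But there are two ways of multiplying vectors.

¹ After an ancient harp whose form $\nabla$ resembles.

The scalar product of the vector $\nabla$ by vector A should look like this:

$$ (\nabla\cdot\mathbf{A}) \equiv \nabla_x A_x + \nabla_y A_y + \nabla_z A_z = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} = \operatorname{div} \mathbf{A} \tag{11.21} $$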

Another method is vector multiplication, which is defined as 
follows: 


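$$ \nabla\times\mathbf{A} \equiv \mathbf{n}^{(x)} (\nabla_y A_z - \nabla_z A_y) + \mathbf{n}^{(y)} (\nabla_z A_x - \nabla_x A_z) + \mathbf{n}^{(z)} (\nabla_x A_y - \nabla_y A_x) = \operatorname{curl} \mathbf{A} \tag{11.22} $$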

We have used the identity sign everywhere to emphasize that 
we are simply dealing with a new system of notation. It is very 
convenient in vector analysis, because the operations are graphic 
and the equations are concise. Use of the del operator in the proof 
of various general relationships makes it simply unnecessary to 
resolve vectors into components. 

In algebraic operations the del is in every way similar to a con¬ 
ventional vector. Multiplication by del signifies its operation on the 
given expression, if it is differentiated. Sometimes the del is multi¬ 
plied by a vector (usually post-multiplied) without operating on it 
as a derivative. In that case it operates on another vector (see (11.30) 
and (11.32)). 

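If we operate with $\nabla$ on a scalar $\varphi$, we obtain a vector
which is called the gradient of scalar $\varphi$:

$$ \operatorname{grad} \varphi \equiv \nabla\varphi = \mathbf{n}^{(x)} \frac{\partial\varphi}{\partial x} + \mathbf{n}^{(y)} \frac{\partial\varphi}{\partial y} + \mathbf{n}^{(z)} \frac{\partial\varphi}{\partial z} \tag{11.23} $$

Its projections are:

$$ \nabla_x\varphi = \frac{\partial\varphi}{\partial x}, \qquad \nabla_y\varphi = \frac{\partial\varphi}{\partial y}, \qquad \nabla_z\varphi = \frac{\partial\varphi}{\partial z} \tag{11.24} $$

From Eqs. (11.24), it can be seen that the vector $\nabla\varphi$ is
perpendicular to the surface $\varphi$ = constant. Indeed, if we take a
vector dl lying on this surface, then, in a displacement dl, $\varphi$
does not change (by the definition of dl). This is written as

$$ d\varphi = \frac{\partial\varphi}{\partial x}\,dl_x + \frac{\partial\varphi}{\partial y}\,dl_y + \frac{\partial\varphi}{\partial z}\,dl_z = (\nabla\varphi\cdot d\mathbf{l}) = 0 \tag{11.25} $$

that is, $\nabla\varphi$ is perpendicular to any vector which lies in the
plane tangential to the surface $\varphi$ = constant at the given point,
which accords with our assertion.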
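A small numerical experiment makes statement (11.25) concrete. In the sketch below the scalar phi, the point p and the tangent vector are assumptions chosen for illustration; the gradient is approximated by central differences.

```python
# The gradient is perpendicular to the surface phi = constant, cf. (11.25).
# phi, p and the tangent vector are illustrative assumptions.

def phi(x, y, z):
    return x * x + 2 * y * y + 3 * z * z

def grad(f, p, h=1e-5):
    """Central-difference approximation of the gradient of f at point p."""
    x, y, z = p
    return (
        (f(x + h, y, z) - f(x - h, y, z)) / (2 * h),
        (f(x, y + h, z) - f(x, y - h, z)) / (2 * h),
        (f(x, y, z + h) - f(x, y, z - h)) / (2 * h),
    )

p = (1.0, 1.0, 1.0)
g = grad(phi, p)                       # analytically (2, 4, 6)
tangent = (2.0, -1.0, 0.0)             # chosen so that g . tangent = 0
g_dot_t = sum(gi * ti for gi, ti in zip(g, tangent))

eps = 1e-4
moved = tuple(pi + eps * ti for pi, ti in zip(p, tangent))
dphi_tangent = phi(*moved) - phi(*p)   # second order in eps

along = tuple(pi + eps * gi for pi, gi in zip(p, g))
dphi_normal = phi(*along) - phi(*p)    # first order in eps

print(g_dot_t, dphi_tangent, dphi_normal)
```

A displacement in the tangent plane changes phi only in second order, whereas a displacement along the gradient changes it in first order.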

Differentiation of Products. In considering the rules of differential
operations with $\nabla$ it should be remembered that with respect to
differentiation the del is a derivative sign, while with respect to the
rules of transformation of a coordinate system it is a vector that
is multiplied like any other vector.

First of all, the gradient of the product of two scalars is calculated
as the derivative of a product:

$$ \operatorname{grad} (\varphi\psi) = \nabla (\varphi\psi) = \varphi\,\nabla\psi + \psi\,\nabla\varphi = \psi \operatorname{grad} \varphi + \varphi \operatorname{grad} \psi \tag{11.26} $$

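The divergence of a product of a scalar with a vector is calculated
thus:

$$ \operatorname{div} \varphi\mathbf{A} = (\nabla_\varphi\cdot\varphi\mathbf{A}) + (\nabla_A\cdot\varphi\mathbf{A}) = \mathbf{A}\cdot(\nabla\varphi) + \varphi\,(\nabla\cdot\mathbf{A}) = \mathbf{A}\cdot\operatorname{grad} \varphi + \varphi \operatorname{div} \mathbf{A} \tag{11.27} $$

Here the indices $\varphi$ and A attached to $\nabla$ show what $\nabla$
is applied to. We find the curl of $\varphi\mathbf{A}$ in a similar manner:

$$ \operatorname{curl} \varphi\mathbf{A} = \nabla_\varphi \times \varphi\mathbf{A} + \nabla_A \times \varphi\mathbf{A} = \operatorname{grad} \varphi \times \mathbf{A} + \varphi \operatorname{curl} \mathbf{A} \tag{11.28} $$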

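Now we shall operate with $\nabla$ on the product of two vectors:

$$ \operatorname{div} (\mathbf{A}\times\mathbf{B}) = \nabla\cdot(\mathbf{A}\times\mathbf{B}) = \nabla_A\cdot(\mathbf{A}\times\mathbf{B}) + \nabla_B\cdot(\mathbf{A}\times\mathbf{B}) $$

We perform a cyclic permutation in both terms, since $\nabla$ can be
treated in the same way as an ordinary vector. In addition, we have
put B after $\nabla_B$ in the second term, and here, as usual, we must
change the sign of the vector product. The result is

$$ \operatorname{div} (\mathbf{A}\times\mathbf{B}) = \mathbf{B}\cdot(\nabla_A\times\mathbf{A}) - \mathbf{A}\cdot(\nabla_B\times\mathbf{B}) = \mathbf{B}\cdot\operatorname{curl} \mathbf{A} - \mathbf{A}\cdot\operatorname{curl} \mathbf{B} \tag{11.29} $$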

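Let us find the curl of a vector product. Here we must use the
relationship
$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = \mathbf{B}\,(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}\,(\mathbf{A}\cdot\mathbf{B})$
to get

$$ \begin{aligned} \operatorname{curl} (\mathbf{A}\times\mathbf{B}) &= \nabla\times(\mathbf{A}\times\mathbf{B}) = \nabla_A\times(\mathbf{A}\times\mathbf{B}) + \nabla_B\times(\mathbf{A}\times\mathbf{B}) \\ &= (\mathbf{B}\cdot\nabla_A)\,\mathbf{A} - (\nabla_A\cdot\mathbf{A})\,\mathbf{B} + (\nabla_B\cdot\mathbf{B})\,\mathbf{A} - (\mathbf{A}\cdot\nabla_B)\,\mathbf{B} \\ &= (\mathbf{B}\cdot\nabla)\,\mathbf{A} - \mathbf{B} \operatorname{div} \mathbf{A} + \mathbf{A} \operatorname{div} \mathbf{B} - (\mathbf{A}\cdot\nabla)\,\mathbf{B} \end{aligned} \tag{11.30} $$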

We note the new symbols $(\mathbf{B}\cdot\nabla)$ and
$(\mathbf{A}\cdot\nabla)$ operating on the vectors A and B. Obviously,
$(\mathbf{A}\cdot\nabla)$ and $(\mathbf{B}\cdot\nabla)$ are symbolic
scalars, equal, by definition of $\nabla$, to

$$ (\mathbf{A}\cdot\nabla) = A_x \nabla_x + A_y \nabla_y + A_z \nabla_z = A_x \frac{\partial}{\partial x} + A_y \frac{\partial}{\partial y} + A_z \frac{\partial}{\partial z} \tag{11.31} $$

and similarly for $(\mathbf{B}\cdot\nabla)$. Then
$(\mathbf{A}\cdot\nabla)\,\mathbf{B}$ is a vector which is obtained
by application of the scalar operation $(\mathbf{A}\cdot\nabla)$ to all
the components of B, in accordance with (11.31). Since
$(\mathbf{A}\cdot\nabla)$ is a scalar operation, it can be introduced
under the vector multiplication sign (dot or cross), keeping in mind
its differential properties.

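Now we only have to compute the gradient of a scalar product
with operations of this type:

$$ \nabla (\mathbf{A}\cdot\mathbf{B}) = \nabla_A (\mathbf{A}\cdot\mathbf{B}) + \nabla_B (\mathbf{A}\cdot\mathbf{B}) $$

We use the same transformation as in the preceding case:

$$ \begin{aligned} \operatorname{grad} (\mathbf{A}\cdot\mathbf{B}) &= (\mathbf{B}\cdot\nabla_A)\,\mathbf{A} + \mathbf{B}\times(\nabla_A\times\mathbf{A}) + (\mathbf{A}\cdot\nabla_B)\,\mathbf{B} + \mathbf{A}\times(\nabla_B\times\mathbf{B}) \\ &= (\mathbf{B}\cdot\nabla)\,\mathbf{A} + \mathbf{B}\times\operatorname{curl} \mathbf{A} + (\mathbf{A}\cdot\nabla)\,\mathbf{B} + \mathbf{A}\times\operatorname{curl} \mathbf{B} \end{aligned} \tag{11.32} $$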
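The product rules (11.29), (11.30) and (11.32) are easy to mistype, so a numerical spot check is useful. The following Python sketch is an illustration only: the fields A and B, the point p and the helper names (jacobian, along, and so on) are assumptions, and all derivatives are central-difference approximations.

```python
import math

# Spot check of (11.29), (11.30) and (11.32) at one point.
# Fields, point and helper names are illustrative assumptions.

def jacobian(F, p, h=1e-5):
    """J[i][j] = dF_i/dx_j at p, by central differences."""
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        Fp, Fm = F(*plus), F(*minus)
        for i in range(3):
            J[i][j] = (Fp[i] - Fm[i]) / (2 * h)
    return J

def div(J):
    return J[0][0] + J[1][1] + J[2][2]

def curl(J):
    return (J[2][1] - J[1][2], J[0][2] - J[2][0], J[1][0] - J[0][1])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def along(J, v):
    """(v . del) F: the scalar operator (11.31) applied to each component."""
    return tuple(dot(row, v) for row in J)

def add(*vs):
    return tuple(sum(c) for c in zip(*vs))

def scale(k, v):
    return tuple(k * c for c in v)

def A(x, y, z):
    return (y * z, x * x, math.sin(x) + z)

def B(x, y, z):
    return (x + y, y * z * z, math.cos(y))

def AxB(x, y, z):
    return cross(A(x, y, z), B(x, y, z))

def AdotB(x, y, z):
    # scalar packed into a vector so that jacobian() can be reused
    return (dot(A(x, y, z), B(x, y, z)), 0.0, 0.0)

p = (0.4, -0.7, 1.3)
a, b = A(*p), B(*p)
JA, JB, JAB = jacobian(A, p), jacobian(B, p), jacobian(AxB, p)

lhs_29 = div(JAB)
rhs_29 = dot(b, curl(JA)) - dot(a, curl(JB))

lhs_30 = curl(JAB)
rhs_30 = add(along(JA, b), scale(-div(JA), b),
             scale(div(JB), a), scale(-1.0, along(JB, a)))

lhs_32 = tuple(jacobian(AdotB, p)[0])
rhs_32 = add(along(JA, b), cross(b, curl(JA)),
             along(JB, a), cross(a, curl(JB)))

err_30 = max(abs(l - r) for l, r in zip(lhs_30, rhs_30))
err_32 = max(abs(l - r) for l, r in zip(lhs_32, rhs_32))
print(abs(lhs_29 - rhs_29), err_30, err_32)
```

All three discrepancies are at the level of the finite-difference error.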


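Certain Special Formulas. We note certain essential cases of
operations involving $\nabla$.

Using (11.27) and (11.9), we obtain

$$ \operatorname{div} \frac{\mathbf{r}}{r^3} = \frac{1}{r^3} \operatorname{div} \mathbf{r} + \mathbf{r}\cdot\operatorname{grad} \frac{1}{r^3} = \frac{3}{r^3} - \frac{3\,(\mathbf{r}\cdot\mathbf{r})}{r^5} = 0 \tag{11.33} $$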


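In taking the gradient of $r^{-3}$ we applied the rule for differentiating
a composite function: we first differentiated $r^{-3}$ with respect to r
and then took $\nabla r$. This is done as follows. Knowing that
$r = (x^2 + y^2 + z^2)^{1/2}$, we find

$$ \frac{\partial r}{\partial x} = \frac{\partial}{\partial x} (x^2 + y^2 + z^2)^{1/2} = \frac{x}{(x^2 + y^2 + z^2)^{1/2}} = \frac{x}{r} $$

Going over from the component with respect to x to vector notation,
we obtain

$$ \operatorname{grad} r = \nabla r = \frac{\mathbf{r}}{r} \tag{11.34} $$

whence

$$ \operatorname{grad} \frac{1}{r^3} = -\frac{3}{r^4}\,\nabla r = -\frac{3\,\mathbf{r}}{r^5} $$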

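The curl of a radius vector is zero. For example, for the component
along the x axis we have

$$ \operatorname{curl}_x \mathbf{r} = \frac{\partial z}{\partial y} - \frac{\partial y}{\partial z} = 0 $$

and in general

$$ \operatorname{curl} \mathbf{r} = 0 \tag{11.35} $$

We now take

$$ (\mathbf{A}\cdot\nabla)\,x = A_x \frac{\partial x}{\partial x} + A_y \frac{\partial x}{\partial y} + A_z \frac{\partial x}{\partial z} = A_x $$

and for all components of r

$$ (\mathbf{A}\cdot\nabla)\,\mathbf{r} = \mathbf{A} \tag{11.36} $$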

We shall now show how to operate with $\nabla$ on a vector whose
components depend only upon the absolute value of the radius vector r.
As in the case of a scalar quantity, the rule of differentiating
a composite function must be applied, with $\mathbf{r}/r$ substituted
for $\nabla r$ according to (11.34). For the divergence of A(r) we obtain

$$ \operatorname{div} \mathbf{A}(r) = (\dot{\mathbf{A}}\cdot\nabla r) = \frac{1}{r}\,(\dot{\mathbf{A}}\cdot\mathbf{r}) \tag{11.37} $$

where $\dot{\mathbf{A}}$ is the total derivative of A(r) with respect to
the argument r, that is, a vector whose components are the derivatives
of the three components of A(r) with respect to r: $\dot{A}_x$,
$\dot{A}_y$, $\dot{A}_z$. Further,

$$ \operatorname{curl} \mathbf{A}(r) = \nabla r \times \dot{\mathbf{A}} = \frac{1}{r}\,(\mathbf{r}\times\dot{\mathbf{A}}) \tag{11.38} $$
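Formulas (11.33), (11.37) and (11.38) can be compared with direct numerical differentiation. In the sketch below the vector function A(r), the evaluation point and all helper names are assumptions made for the example.

```python
import math

# Check of (11.33), (11.37) and (11.38) by central differences.
# A_of_r, its derivative A_dot and the point p are illustrative assumptions.

def jacobian(F, p, h=1e-5):
    """J[i][j] = dF_i/dx_j at p, by central differences."""
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        Fp, Fm = F(*plus), F(*minus)
        for i in range(3):
            J[i][j] = (Fp[i] - Fm[i]) / (2 * h)
    return J

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def A_of_r(r):
    return (r * r, math.sin(r), math.exp(-r))

def A_dot(r):
    # derivative of A_of_r with respect to r, taken by hand
    return (2 * r, math.cos(r), -math.exp(-r))

def field(x, y, z):
    return A_of_r(math.sqrt(x * x + y * y + z * z))

def r_over_r3(x, y, z):
    r = math.sqrt(x * x + y * y + z * z)
    return (x / r ** 3, y / r ** 3, z / r ** 3)

p = (0.8, -0.5, 1.1)
r = math.sqrt(sum(c * c for c in p))

J = jacobian(field, p)
div_numeric = J[0][0] + J[1][1] + J[2][2]
div_formula = sum(d * c for d, c in zip(A_dot(r), p)) / r          # (11.37)

curl_numeric = (J[2][1] - J[1][2], J[0][2] - J[2][0], J[1][0] - J[0][1])
curl_formula = tuple(c / r for c in cross(p, A_dot(r)))            # (11.38)
curl_err = max(abs(n - f) for n, f in zip(curl_numeric, curl_formula))

K = jacobian(r_over_r3, p)
div_r_over_r3 = K[0][0] + K[1][1] + K[2][2]                        # (11.33)

print(div_numeric - div_formula, curl_err, div_r_over_r3)
```

The point p is taken away from the origin, where r/r^3 is singular and (11.33) does not apply.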


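Repeated Differentiation. Let us investigate certain results
concerning repeated operations with $\nabla$.

The curl of the gradient of a scalar is equal to zero:

$$ \operatorname{curl} \operatorname{grad} \varphi = \nabla\times\nabla\varphi = (\nabla\times\nabla)\,\varphi = 0 \tag{11.39} $$

since the vector product of any vector (including $\nabla$) by itself is
equal to zero. This can also be seen by expanding curl grad $\varphi$ in
terms of its components. The divergence of a curl is also equal to
zero:

$$ \operatorname{div} \operatorname{curl} \mathbf{A} = \nabla\cdot(\nabla\times\mathbf{A}) = (\nabla\times\nabla)\cdot\mathbf{A} = 0 \tag{11.40} $$

Let us write down the divergence of the gradient of a scalar $\varphi$
in component form. From Eqs. (11.7) and (11.24) we have

$$ \operatorname{div} \operatorname{grad} \varphi = (\nabla\cdot\nabla)\,\varphi = \frac{\partial^2\varphi}{\partial x^2} + \frac{\partial^2\varphi}{\partial y^2} + \frac{\partial^2\varphi}{\partial z^2} = \nabla^2\varphi \tag{11.41} $$

Here $\nabla^2$ is the so-called Laplace operator, or Laplacian:

$$ \nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} $$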

Finally, the curl of a curl can be expanded as a double vector
product:

$$ \operatorname{curl} \operatorname{curl} \mathbf{A} = \nabla\times(\nabla\times\mathbf{A}) = \nabla\,(\nabla\cdot\mathbf{A}) - (\nabla\cdot\nabla)\,\mathbf{A} = \operatorname{grad} \operatorname{div} \mathbf{A} - \nabla^2 \mathbf{A} \tag{11.42} $$

The latter equation is usually used to determine $\nabla^2 \mathbf{A}$,
since in curvilinear coordinates $\nabla^2\varphi$ and
$\nabla^2 \mathbf{A}$ are expressed differently.
It cannot be said that $\nabla^2 \mathbf{A}$ is the divergence of a
vector gradient, because the gradient of a vector has not been defined.
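The identities (11.39), (11.40) and (11.42) can be illustrated by composing finite-difference operators in Cartesian coordinates. The sketch below is an assumption-based example: the step H, the scalar phi, the vector field A and the point p are arbitrary, and d(f, j) is a hypothetical helper that returns an approximate partial derivative as a new function.

```python
import math

# Repeated differentiation with composed central differences.
# H, phi, A and p are illustrative assumptions.

H = 1e-3

def d(f, j):
    """Central-difference approximation of df/dx_j, returned as a function."""
    def df(x, y, z):
        p, m = [x, y, z], [x, y, z]
        p[j] += H
        m[j] -= H
        return (f(*p) - f(*m)) / (2 * H)
    return df

def grad(f):
    return [d(f, 0), d(f, 1), d(f, 2)]

def div(F):
    parts = [d(F[0], 0), d(F[1], 1), d(F[2], 2)]
    return lambda x, y, z: sum(g(x, y, z) for g in parts)

def curl(F):
    dzy, dyz = d(F[2], 1), d(F[1], 2)   # dFz/dy, dFy/dz
    dxz, dzx = d(F[0], 2), d(F[2], 0)   # dFx/dz, dFz/dx
    dyx, dxy = d(F[1], 0), d(F[0], 1)   # dFy/dx, dFx/dy
    return [lambda x, y, z: dzy(x, y, z) - dyz(x, y, z),
            lambda x, y, z: dxz(x, y, z) - dzx(x, y, z),
            lambda x, y, z: dyx(x, y, z) - dxy(x, y, z)]

def phi(x, y, z):
    return math.sin(x * y) + x * z ** 3

A = [lambda x, y, z: y * z * z,
     lambda x, y, z: math.cos(x) * z,
     lambda x, y, z: x * y + math.exp(z)]

p = (0.3, 0.8, -0.5)

curl_grad = [c(*p) for c in curl(grad(phi))]              # (11.39)
div_curl = div(curl(A))(*p)                               # (11.40)

lhs = [c(*p) for c in curl(curl(A))]                      # curl curl A
grad_div = grad(div(A))
rhs = [grad_div[i](*p) - div(grad(A[i]))(*p) for i in range(3)]
err_42 = max(abs(l - r) for l, r in zip(lhs, rhs))        # (11.42)

print(curl_grad, div_curl, err_42)
```

Because the difference operators along different axes commute, just as the components of the del operator do, the identities hold here up to rounding error.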

Curvilinear Coordinates. In many problems it is useful to go over
from rectilinear to curvilinear coordinates. We shall now show how
various vector operations can be written down in curvilinear
coordinates.

Curvilinear coordinates $q_1$, $q_2$, $q_3$ are termed orthogonal if only
the quadratic terms $dq_1^2$, $dq_2^2$, $dq_3^2$ appear in the expression
for the element of length $dl^2$, and not the products $dq_1\,dq_2$,
$dq_1\,dq_3$, $dq_2\,dq_3$, similar to the way that
$dl^2 = dx^2 + dy^2 + dz^2$ appears in rectangular coordinates. In
orthogonal coordinates

$$ dl^2 = h_1^2\,dq_1^2 + h_2^2\,dq_2^2 + h_3^2\,dq_3^2 \tag{11.43} $$

For example, in spherical coordinates $q_1 = r$, $q_2 = \vartheta$,
$q_3 = \varphi$. The element of length in spherical coordinates is
(see (3.10))

$$ dl^2 = dr^2 + r^2\,d\vartheta^2 + r^2 \sin^2\vartheta\,d\varphi^2 $$

so that

$$ h_1 = 1, \qquad h_2 = r, \qquad h_3 = r \sin\vartheta $$


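Let us construct an elementary parallelepiped (Figure 19). Then
the components of the gradient are:

$$ \operatorname{grad}_1 \psi = \frac{1}{h_1} \frac{\partial\psi}{\partial q_1}, \qquad \operatorname{grad}_2 \psi = \frac{1}{h_2} \frac{\partial\psi}{\partial q_2}, \qquad \operatorname{grad}_3 \psi = \frac{1}{h_3} \frac{\partial\psi}{\partial q_3} \tag{11.44} $$

In order to find the divergence we repeat the proof of the Gauss
theorem for Figure 19. The area ADCB is equal to $h_2 h_3\,dq_2\,dq_3$.
The flux of vector A across it, taken with the outward normal, is

$$ -A_1(q_1)\,h_2 h_3\,dq_2\,dq_3 $$

Here, $h_2$ and $h_3$ are taken for the same value of $q_1$ as $A_1$ is.
The sum of the fluxes across the areas ADCB and A'B'C'D' is

$$ \frac{\partial}{\partial q_1} (h_2 h_3 A_1)\,dq_1\,dq_2\,dq_3 $$

where we have used the expansion of the quantity $h_2 h_3 A_1$ at the
point $q_1 + dq_1$ in terms of $dq_1$, in a way similar to (11.4). The
total flux across all the boundaries is

$$ dJ = \left[ \frac{\partial}{\partial q_1} (h_2 h_3 A_1) + \frac{\partial}{\partial q_2} (h_3 h_1 A_2) + \frac{\partial}{\partial q_3} (h_1 h_2 A_3) \right] dq_1\,dq_2\,dq_3 $$

Let us now take advantage of the definition of divergence (11.8)
to get

$$ dJ = \operatorname{div} \mathbf{A}\,dV = \operatorname{div} \mathbf{A}\,(h_1 h_2 h_3\,dq_1\,dq_2\,dq_3) $$

Hence

$$ \operatorname{div} \mathbf{A} = \frac{1}{h_1 h_2 h_3} \left[ \frac{\partial}{\partial q_1} (h_2 h_3 A_1) + \frac{\partial}{\partial q_2} (h_3 h_1 A_2) + \frac{\partial}{\partial q_3} (h_1 h_2 A_3) \right] \tag{11.45} $$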

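If instead of $A_1$, $A_2$, $A_3$ we substitute the expressions (11.44),
the result will be the Laplacian of a scalar in orthogonal curvilinear
coordinates. In spherical coordinates it is

$$ \nabla^2 \psi = \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial\psi}{\partial r} \right) + \frac{1}{r^2 \sin\vartheta} \frac{\partial}{\partial\vartheta} \left( \sin\vartheta\, \frac{\partial\psi}{\partial\vartheta} \right) + \frac{1}{r^2 \sin^2\vartheta} \frac{\partial^2\psi}{\partial\varphi^2} \tag{11.46} $$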
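Expression (11.46) can be compared with the Cartesian Laplacian (11.41) for a concrete function. The sketch below is illustrative only: the test function psi, the evaluation point and the step h are assumptions, and the three terms of (11.46) are discretized by central differences.

```python
import math

# Spherical Laplacian (11.46) versus the Cartesian result for a test function.
# psi, the point (r, th, ph) and the step h are illustrative assumptions.

def psi(x, y, z):
    return x * x * y + z ** 3            # Cartesian Laplacian: 2*y + 6*z

def psi_sph(r, th, ph):
    return psi(r * math.sin(th) * math.cos(ph),
               r * math.sin(th) * math.sin(ph),
               r * math.cos(th))

def laplacian_spherical(f, r, th, ph, h=1e-3):
    f0 = f(r, th, ph)
    # (1/r^2) d/dr (r^2 df/dr)
    radial = ((r + h / 2) ** 2 * (f(r + h, th, ph) - f0)
              - (r - h / 2) ** 2 * (f0 - f(r - h, th, ph))) / (h * h * r * r)
    # (1/(r^2 sin th)) d/dth (sin th df/dth)
    polar = (math.sin(th + h / 2) * (f(r, th + h, ph) - f0)
             - math.sin(th - h / 2) * (f0 - f(r, th - h, ph))) \
            / (h * h * r * r * math.sin(th))
    # (1/(r^2 sin^2 th)) d^2 f/dph^2
    azimuthal = (f(r, th, ph + h) - 2 * f0 + f(r, th, ph - h)) \
                / (h * h * (r * math.sin(th)) ** 2)
    return radial + polar + azimuthal

r, th, ph = 1.7, 0.9, 0.4
y = r * math.sin(th) * math.sin(ph)
z = r * math.cos(th)

numeric = laplacian_spherical(psi_sph, r, th, ph)
exact = 2 * y + 6 * z
print(numeric, exact)
```

The point is chosen away from the polar axis, where the factor 1/sin of the polar angle in (11.46) becomes singular.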


With the aid of Stokes’ theorem, we can also calculate the curl
in curvilinear coordinates. Without repeating the proof of
theorem (11.19), we write:

$$ \begin{aligned} \operatorname{curl}_1 \mathbf{A} &= \frac{1}{h_2 h_3} \left[ \frac{\partial}{\partial q_2} (A_3 h_3) - \frac{\partial}{\partial q_3} (A_2 h_2) \right] \\ \operatorname{curl}_2 \mathbf{A} &= \frac{1}{h_3 h_1} \left[ \frac{\partial}{\partial q_3} (A_1 h_1) - \frac{\partial}{\partial q_1} (A_3 h_3) \right] \\ \operatorname{curl}_3 \mathbf{A} &= \frac{1}{h_1 h_2} \left[ \frac{\partial}{\partial q_1} (A_2 h_2) - \frac{\partial}{\partial q_2} (A_1 h_1) \right] \end{aligned} \tag{11.47} $$

If it is necessary to form the Laplacian of a vector, the following
procedure is adopted. Apply the operation of finding the gradient
of the divergence of the vector according to (11.44) and (11.45).
Then develop the curl of the curl by performing operation (11.47)
twice. From (11.42), the difference between the two expressions
yields the required Laplacian. Here, generally speaking, we find
that some component of the Laplacian of a vector depends not only
on the same component of the vector itself but on the other two as
well.


Transformation to Tensor Notation. We shall now show how some 
vector operations are written in tensor notation. We always make 
use of the summation rule (see Part I, Section 9). 

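Then the divergence of a vector in orthogonal coordinates is
written very simply:

$$ \operatorname{div} \mathbf{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} = \frac{\partial A_\alpha}{\partial x_\alpha} \tag{11.48} $$

Tensor operations in curvilinear coordinates are much more
involved and will not be required in this book.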

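In tensor form the Laplacian of a scalar quantity is written as
follows:

$$ \nabla^2 \psi = \frac{\partial}{\partial x_\alpha} \frac{\partial\psi}{\partial x_\alpha} \equiv \frac{\partial^2\psi}{\partial x_\alpha^2} \tag{11.49} $$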


The Laplacian of a vector in Cartesian coordinates is written 
similarly. 

We introduce the concept of an invariant tensor, as we call a tensor
which retains its form in a transformation to another coordinate
system. We have already encountered one such tensor: the tensor
$\delta_{\mu\nu}$ in Section 9. Another tensor possessing this property
is the tensor $\varepsilon_{\mu\nu\lambda}$ of rank 3, whose components
are equal to +1 if the indices are in cyclic order (123, 312, and 231),
to -1 if the order is not cyclic (213, 132, and 321), and to zero if any
two of the indices $\mu$, $\nu$, $\lambda$ are equal.

In other words, it can be said that the tensor
$\varepsilon_{\mu\nu\lambda}$ is antisymmetric with respect to any pair
of indices: a permutation of them reverses the sign of its component.
Indeed, component 213 differs from 123 by one permutation and is
therefore negative, while 312 is obtained from 123 by two permutations.
Two sign reversals yield a plus. Obviously, a permutation of two
identical indices changes nothing, but on the other hand it should
reverse the sign of the component. Hence a component with identical
indices is equal to zero, since only zero is equal to itself with its
sign reversed.

The property of antisymmetry of a tensor with respect to any
pair of indices is conserved in a rotation of the coordinate system or,
in other words, it is an invariant property. Indeed, the equality
$\varepsilon_{\mu\nu\lambda} = -\varepsilon_{\nu\mu\lambda}$ is of tensor
form and is therefore valid in any coordinate system. This can be
verified directly by applying the tensor transformation formulas.

But it also follows from this that the same components of the
transformed tensor $\varepsilon'_{\mu\nu\lambda}$ differ from zero as
those of tensor $\varepsilon_{\mu\nu\lambda}$ in the initial coordinate
system, and that they are of the same sign. It remains to show that
the components of $\varepsilon'_{\mu\nu\lambda}$ are also equal to
$\pm 1$.

We write the general transformation formula of a tensor:

$$ \varepsilon'_{\mu\nu\lambda} = (\mu, \alpha)(\nu, \beta)(\lambda, \gamma)\,\varepsilon_{\alpha\beta\gamma} \tag{11.50} $$

But we have seen that the symmetry of the tensor
$\varepsilon'_{\mu\nu\lambda}$ is the same as that of the initial tensor
$\varepsilon_{\alpha\beta\gamma}$. Hence,
$\varepsilon'_{\mu\nu\lambda} = C \varepsilon_{\mu\nu\lambda}$,
where C is a number that has to be determined. We introduce it
into (11.50) and multiply both sides of the equation by
$\varepsilon_{\mu\nu\lambda}$. In the left-hand side we have C multiplied
by the sum of the squares of the components,
$\varepsilon_{\mu\nu\lambda}\varepsilon_{\mu\nu\lambda}$. There are
altogether six nonzero components, which we listed before, so that this
sum is equal to 6. In the right-hand side we have the determinant made
up of the transformation coefficients $(\mu, \alpha)$, which is also
multiplied by 6.

Indeed, the coefficients $(\mu, \alpha)$, $(\nu, \beta)$, and
$(\lambda, \gamma)$ are taken each time in threes and, according to the
properties of $\varepsilon_{\alpha\beta\gamma}$, only from different rows
and columns of the table, with the sign corresponding to whether the
permutation of the rows and columns is even or odd. The factor 6 is
obtained because in the summation one triplet of indices can be taken
arbitrarily, while the number of permutations of three by three is 3!,
or 6.

After cancelling out 6 we find that the required number C is equal
to a determinant made up of the coefficients of the rotation, which
is equal to unity. Hence $\varepsilon_{\mu\nu\lambda}$ is an invariant
tensor which retains its form in any rotation of the coordinate system.
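The invariance just proved can be verified directly by applying (11.50) with an explicit rotation matrix. The following sketch is an illustration under stated assumptions: indices run over 0, 1, 2 instead of 1, 2, 3, the rotation angles are arbitrary, and the closed-form expression used for the components of the antisymmetric tensor is merely a convenient device, not something taken from the text.

```python
import math

# Transforming the antisymmetric unit tensor by (11.50) under a rotation.
# Index range 0..2 and the rotation angles are illustrative assumptions.

def eps(a, b, c):
    """+1 for cyclic order of (0, 1, 2), -1 for anticyclic, 0 otherwise."""
    return (a - b) * (b - c) * (c - a) // 2

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

R = matmul(rot_x(0.7), rot_z(1.1))       # coefficients (mu, alpha)
idx = range(3)

eps_rot = {
    (m, n, l): sum(R[m][a] * R[n][b] * R[l][c] * eps(a, b, c)
                   for a in idx for b in idx for c in idx)
    for m in idx for n in idx for l in idx
}

max_dev = max(abs(eps_rot[k] - eps(*k)) for k in eps_rot)
sum_of_squares = sum(eps(a, b, c) ** 2 for a in idx for b in idx for c in idx)
print(max_dev, sum_of_squares)
```

The transformed components coincide with the original ones to rounding error, and the sum of the squares of the components is 6, as used in the proof.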

We now form the following combination from two vectors and
a tensor:

$$ D_\alpha = \varepsilon_{\alpha\beta\gamma} A_\beta B_\gamma \tag{11.51} $$

The quantity $D_\alpha$ written in terms of its components has the
following values:

$$ D_1 = A_2 B_3 - A_3 B_2, \qquad D_2 = A_3 B_1 - A_1 B_3, \qquad D_3 = A_1 B_2 - A_2 B_1 $$

We have obtained the components of a vector product. Equation (11.51)
immediately shows that a vector product behaves like
a vector in a transformation of the coordinates, because (11.51)
is a tensor equation: such equations are valid in any coordinate
system and consequently lack the arbitrariness associated with its
choice. Any equation expressing a physical law must possess this
property, which is the main reason for the use of vector and tensor
equations in physics.

The curl of the vector is determined similarly to (11.51):

$$ \varepsilon_{\alpha\beta\gamma} \frac{\partial A_\gamma}{\partial x_\beta} = (\nabla\times\mathbf{A})_\alpha = \operatorname{curl}_\alpha \mathbf{A} \tag{11.52} $$
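Formula (11.51) translates directly into a double sum. The sketch below is a minimal illustration; the function names and the sample vectors are assumptions, and indices run over 0, 1, 2 rather than 1, 2, 3.

```python
# Vector product via the antisymmetric unit tensor, as in (11.51).
# Function names and sample vectors are illustrative assumptions.

def eps(a, b, c):
    """+1 for cyclic order of (0, 1, 2), -1 for anticyclic, 0 otherwise."""
    return (a - b) * (b - c) * (c - a) // 2

def cross_eps(A, B):
    """D_alpha = eps_{alpha beta gamma} A_beta B_gamma, summed over beta, gamma."""
    return tuple(
        sum(eps(al, be, ga) * A[be] * B[ga]
            for be in range(3) for ga in range(3))
        for al in range(3)
    )

def cross_components(A, B):
    """The same product written out in components."""
    return (A[1] * B[2] - A[2] * B[1],
            A[2] * B[0] - A[0] * B[2],
            A[0] * B[1] - A[1] * B[0])

A = (1, 2, 3)
B = (4, 5, 6)
D = cross_eps(A, B)
print(D, cross_components(A, B))
```

Only six of the 27 terms in the double sum survive, which reproduces the three two-term components written out above.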


EXERCISES 

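1. Use Eqs. (11.26)-(11.42) to calculate the following expressions
without introducing components: (a) $\nabla^2 (1/r)$ at $r \neq 0$;
(b) $\operatorname{div} \varphi(r)\,\mathbf{r}$, $\operatorname{curl} \varphi(r)\,\mathbf{r}$;
(c) $\nabla (\mathbf{A}\cdot\mathbf{r})$, where A is a constant;
(d) $\nabla (\mathbf{A}(r)\cdot\mathbf{r})$;
(e) $\operatorname{div} \varphi(r)\,\mathbf{A}(r)$, $\operatorname{curl} \varphi(r)\,\mathbf{A}(r)$;
(f) $\operatorname{div} [\mathbf{r}\times(\mathbf{A}\times\mathbf{r})]$, where A = constant;
(g) $\operatorname{curl} [\mathbf{r}\times(\mathbf{A}\times\mathbf{r})]$, where A = constant;
(h) $\nabla^2 \mathbf{A}(r)$, see (11.42);
(i) $\nabla (\mathbf{A}(r)\cdot\mathbf{B}(r))$;
(j) $\operatorname{curl} (\mathbf{A}\times\mathbf{r})$, where A = constant;
(k) $\operatorname{div} (\mathbf{A}\times\mathbf{r})$, where A = constant;
(l) $\nabla^2 (\mathbf{r}/r)$.

Answers. A dot denotes differentiation with respect to r.
(a) $\nabla^2 (1/r) = \operatorname{div} \operatorname{grad} (1/r) = -\operatorname{div} (\mathbf{r}/r^3) = 0$;
(b) $3\varphi + r\dot{\varphi}$ and 0; (c) A;
(d) $\mathbf{A} + \mathbf{r}\,(\mathbf{r}\cdot\dot{\mathbf{A}})/r$;
(e) $\dot{\varphi}\,(\mathbf{r}\cdot\mathbf{A})/r + \varphi\,(\mathbf{r}\cdot\dot{\mathbf{A}})/r$
and $\dot{\varphi}\,(\mathbf{r}\times\mathbf{A})/r + \varphi\,(\mathbf{r}\times\dot{\mathbf{A}})/r$;
(f) $-2\,(\mathbf{A}\cdot\mathbf{r})$; (g) $3\,(\mathbf{r}\times\mathbf{A})$;
(h) $\ddot{\mathbf{A}} + 2\dot{\mathbf{A}}/r$;
(i) $\mathbf{r}\,(\dot{\mathbf{A}}\cdot\mathbf{B})/r + \mathbf{r}\,(\mathbf{A}\cdot\dot{\mathbf{B}})/r$;
(j) 2A; (k) 0; (l) $-2\mathbf{r}/r^3$.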

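2. Write $\nabla^2 \psi$ in cylindrical coordinates.

3. Write the three components of $\nabla^2 \mathbf{A}$ in spherical
coordinates.

4. Prove that
$\varepsilon_{\alpha\beta\gamma}\,\varepsilon_{\alpha\mu\nu} = \delta_{\beta\mu}\delta_{\gamma\nu} - \delta_{\beta\nu}\delta_{\gamma\mu}$
and deduce from this the rule
$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = \mathbf{B}\,(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}\,(\mathbf{A}\cdot\mathbf{B})$.

5. Solve the tensor equation
$x_\alpha x_\beta A_\beta - A_\alpha = a x_\alpha$, where $x_\alpha$ is the
unknown vector, and vector $A_\beta$ and scalar a are given.

Solution. Multiply both sides of the equation by $A_\alpha$, which leads
to a quadratic equation for the scalar $A_\alpha x_\alpha$. Solving it, we
substitute the result into the initial equation, which yields a linear
equation with respect to the components of vector $x_\alpha$.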

6. Given a straight line and a point O at a distance a from it. Let a
point in the plane through the line and the given point O be at a distance z
from the line and r from point O. Introducing the coordinates
$\xi = (r + z)/2$, $\eta = (r - z)/2$, and $\varphi$, where $\varphi$ is
the angle of rotation around an axis through O perpendicular to the
straight line, write the Laplacian in terms of these coordinates.

MAXWELL’S EQUATIONS

Interaction in Mechanics and in Electrodynamics. Interactions in 
electrodynamics take place not between individual charges, but 
between charges and the surrounding electromagnetic field. The 
physical concept of a field in electrodynamics differs essentially 
from the field concept in Newtonian mechanics. 

We know that the space in which gravitational forces act is called 
a gravitational field. The value of these forces at any point of the 
field is determined, in Newtonian mechanics, by the instantaneous 
positions of the gravitating bodies, no matter how far they are 
from the given point. In electrodynamics such a field representation 
is not satisfactory: during the time that it takes an electromagnetic
disturbance to move from one charge to another, the latter can move
a very great distance. Elementary charges (electrons, protons,
mesons) very often have velocities close to the velocity of propagation
of electromagnetic disturbances.

The modern theory of gravity (the general theory of relativity) 
shows that gravitational interaction, too, propagates with a finite 
velocity. But since macroscopic bodies move considerably slower, 
within the scale of the solar system the finite velocity of propagation 
of gravitational forces introduces only an insignificant correction 
to the laws of motion of Newtonian mechanics. 

In the electrodynamics of elementary charges, the finite velocity 
of propagation of electromagnetic disturbances is of fundamental 
significance. If the action of a field affects the energy or momentum 
of a charged particle, the change can be directly transmitted only 
to the surrounding electromagnetic field, because, for the energy 
and momentum of other particles to change, a finite time interval 
is required before the electromagnetic disturbance excited by the 
charge reaches them. But this means that the electromagnetic field
itself possesses energy and momentum, otherwise these two important
mechanical quantities would not always be conserved, vanishing
at the instant when the signal is emitted and reappearing at the
instant when it is received.

In Newtonian mechanics it is assumed that a disturbance is
transmitted instantaneously, hence there is no need to ascribe momentum
or energy to the field: as soon as one gravitating particle releases
a certain momentum or energy another immediately acquires them.

Since, as has just been pointed out, an electromagnetic field
possesses momentum and energy, it can be treated as an independent
physical entity in exactly the same way as charged particles. The
equations of electrodynamics must directly describe the propagation
of electromagnetic disturbances in space and the interactions of the
charges with the field.

Interaction between charges is effected through the electromagnetic
field. Such laws as the Coulomb or Biot-Savart laws, in which
only instantaneous positions or instantaneous velocities of the
charges appear, are of an approximate nature and hold only when
the relative velocities of the charges are small compared with the
propagation velocity of the electromagnetic disturbances. This
velocity is a fundamental constant appearing in the equations of
electrodynamics. It is to a great degree of accuracy equal to
$3 \times 10^{10}$ cm s$^{-1}$.

The reality of the field is particularly evident from the fact that 
electrodynamic equations admit of a solution in the absence of 
charges. These solutions describe electromagnetic waves in vacuum, 
in particular light and radio waves. 

For two centuries, the supporters of the wave theory of light 
considered that light waves were propagated by a special elastic 
medium permeating all space, the so-called “ether”. In order to 
represent the spread of oscillations it was, naturally, necessary 
to have something oscillating. This “something” was called the
ether. Proceeding from an analogy with the propagation of sound
waves in a continuous medium, the ether was endowed with the 
properties of a fluid, physical phenomena being explained simply by 
reducing them to definite mechanical displacements of bodies. 
In particular, light phenomena were regarded as displacements of 
particles of the special medium, the ether. 

The enunciation of the electromagnetic theory of light led
physicists to the conclusion that the electromagnetic field is real in
the same sense as matter. Moreover, the laws of electrodynamics should
form the basis for the deduction of the complex laws governing the
motions of the atoms of matter, in particular, fluids. The bearer
of the electromagnetic field is physical space, which is inseparable
from the states and motions of real entities.

Electromagnetic Field. The investigation of the electromagnetic
field began with its most apparent manifestations: the electric force
produced by the rubbing of bodies, the properties of magnets, and
the like. It had long been known that bodies are divided into
conductors and insulators, that breaking a magnet in half yields
not two separate poles but two magnets with two poles each, etc.

At the time, physicists were unable to explain why some bodies
conduct electricity while others do not, why iron is strongly
magnetic and copper is evidently not at all. Nevertheless, physicists
were able, without taking up such problems, to learn some of the basic
laws of electromagnetism, such as, for example, the Coulomb law of
the interaction force between two charges, which does not depend
on the origin of the electrification of the bodies carrying the charges.
The same can be said of Faraday’s law of electromagnetic induction,
which relates magnetic and electric forces in a form that does not
reflect the properties of specific bodies.

In the first volume of this book we shall deal exclusively with 
the fundamental laws of electromagnetism. To formulate them there 
is no need to consider the properties of ponderable media, like metals, 
dielectrics, or ferromagnetics. We shall leave specific media alone 
for the time being and deal here only with separate charges and the 
fields they produce. 

This form of presentation of electrodynamics was formerly known 
as the “electron theory”, on the assumption that an electron is simply 
a point charge and nothing more. A real electron possesses very 
complex properties, and though it in no way manifests its actual 
dimensions in experiments, it bears little semblance to the idealized 
charge of the “electron theory”. 

In a sense the charge we shall be speaking of in presenting the 
subject-matter of this part of the book resembles the “mass point” 
of Newtonian mechanics. It is an idealized entity, for the present 
most suitable for the formulation of fundamental theoretical laws. 

In developing Maxwell’s equations—the fundamental equations 
of electrodynamics—we shall, for a time, have to make use of terms 
borrowed from the theory dealing with the properties of material 
media, such as the concept of a current-carrying circuit. Actually, 
though, we do not take account of the complex properties of real 
conductors. In fact, the term “conductor” is employed only for 
considerations of physical visualization: we have in mind an imaginary 
circuit along which a continuously distributed charge is moving. 

One remark of a terminological nature is called for. We shall 
everywhere say simply “field” instead of “field strength”. This is 
conventional in theoretical physics. 

We shall also use the Gaussian system of units, in which electric 
and magnetic fields are expressed in quantities of the same dimensions, 
$\mathrm{g^{1/2}\,cm^{-1/2}\,s^{-1}}$.² 

Electromotive Force. We shall begin with the definition of electromotive 
force in a circuit: this is the work done by the forces of an 
electric field in the passage of a unit charge along a given circuit; 
it is absolutely immaterial whether the circuit is filled with a conductor 
or is simply a closed line drawn in space. In the latter sense 
the concept of electromotive force can be applied to a cyclic induction 
accelerator of electrons, the betatron. 

² This choice of units is especially convenient in the theory of relativity 
(see Sec. 15), in which electric and magnetic field components or the absolute 
values of the fields are involved in the form of linear combinations. The reader 
will find more about different systems of units in the book Units of Physical 
Quantities and Their Dimensions, by L. A. Sena (Mir Publishers, Moscow, 1972). 

Let us write the expression of electromotive force (abbreviated 
as emf and denoted $\mathcal{E}$) in the notation of Section 11. The force acting 
on a unit charge at a given point is, by definition, the electric 
field E. The work done by this force on an element of path dl is the 
scalar product E dl. Then the work done on the whole closed circuit, 
or the emf, is equal to the integral 

$$\mathcal{E} = \oint \mathbf{E} \cdot d\mathbf{l} \tag{12.1}$$

Suppose a surface is stretched over a given circuit. Denoting the 
magnetic field by H, we find that the magnetic flux across an element 
of the surface is, by the definition given in Section 11, $d\Phi = \mathbf{H} \cdot d\mathbf{S}$. 
The magnetic flux across the whole surface stretched on the circuit is 

$$\Phi = \int \mathbf{H} \cdot d\mathbf{S} \tag{12.2}$$


It is important that the magnitude of the flux, $\Phi$, does not depend 
on the specific form of the surface stretched on the circuit. This 
can be visually explained by the fact that the magnetic field lines 
cannot originate or terminate in an empty space devoid of magnets. 
Consequently, if two different surfaces are stretched over the circuit, 
the flux across each must be the same—it can neither decrease nor 
increase between them. 

Faraday’s induction law is written in the form of the following 
equation: 


$$\mathcal{E} = -\frac{1}{c} \frac{d\Phi}{dt} \tag{12.3}$$


If all quantities are expressed in the Gaussian system, the constant $c$ 
in the proportionality factor is equal to $3 \times 10^{10}$ cm·s$^{-1}$. It is readily apparent 
from (11.1) and (11.2) that c has the dimensions of velocity. 

If a circuit is in vacuum, as in the case of an induction accelerator, 
the work done on the charge augments its energy. 


Maxwell’s Equation for curl E. Thus, Eq. (12.3) refers to any 
arbitrary closed circuit. We substitute Eqs. (12.1) and (12.2) into 
this equation to get 

$$\oint \mathbf{E} \cdot d\mathbf{l} = -\frac{1}{c} \frac{d}{dt} \int \mathbf{H} \cdot d\mathbf{S} \tag{12.4}$$

The left-hand side of the equation can be transformed by Stokes’ 
theorem (11.19), and in the right-hand side the order of time differentiation 
and surface integration can be interchanged, since they 
are performed for independent variables. In addition, taking this 
integral over to the left-hand side, we obtain 

$$\int \left( \operatorname{curl} \mathbf{E} + \frac{1}{c} \frac{\partial \mathbf{H}}{\partial t} \right) \cdot d\mathbf{S} = 0 \tag{12.5}$$

But the initial circuit is completely arbitrary, that is, it can 
have arbitrary magnitude and shape. Let us assume that the expression 
in parentheses in (12.5) is not equal to zero. Then we can choose 
the surface and the circuit that bounds it so that the integral (12.5) 
does not become zero. Thus, the following equation must be satisfied: 

$$\operatorname{curl} \mathbf{E} + \frac{1}{c} \frac{\partial \mathbf{H}}{\partial t} = 0 \tag{12.6}$$

In comparison with (12.3), this equation does not contain anything 
new physically; it is the same induction law, but rewritten in dif¬ 
ferential form for an infinitely small circuit. In many applications 
the differential form is more convenient than the integral form. 
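
As an illustration, consider the induction accelerator mentioned above in its simplest idealization: a circular circuit of radius $R$ in a magnetic field $H(t)$ that is uniform over the circle and directed along its axis. The uniformity is an assumption made here only to keep the arithmetic short; in a real betatron the field varies with radius. Under this assumption the integral law (12.3) and the differential law (12.6) can be compared directly, with $E_{\psi}$ denoting the azimuthal field component in cylindrical coordinates $r, \psi, z$:

```latex
\begin{aligned}
\Phi &= \pi R^{2} H(t), \qquad
\mathcal{E} = -\frac{1}{c}\frac{d\Phi}{dt} = -\frac{\pi R^{2}}{c}\frac{dH}{dt}, \\
\mathcal{E} &= 2\pi R\,E_{\psi}(R)
\;\Longrightarrow\;
E_{\psi}(r) = -\frac{r}{2c}\frac{dH}{dt}, \\
(\operatorname{curl}\mathbf{E})_{z} &= \frac{1}{r}\frac{\partial}{\partial r}\bigl(r E_{\psi}\bigr)
= -\frac{1}{c}\frac{dH}{dt}.
\end{aligned}
```

The second line uses the axial symmetry of the problem, so that the emf is simply the circumference multiplied by the azimuthal field; since the same argument holds for any radius inside the field region, $R$ may be replaced by $r$. The last line reproduces (12.6) for the component along the axis.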

The Equation for div H. As we have already said, magnetic lines 
of force cannot originate or terminate in vacuum, that is, they are 
either closed or go off to infinity. Hence, into any closed surface 
the same number of magnetic field lines enter as leave. The magnetic 
flux in free space, across any closed surface, is equal to zero: 

$$\oint \mathbf{H} \cdot d\mathbf{S} = 0 \tag{12.7}$$

Transforming this integral to a volume integral according to the 
Gauss theorem (11.6), we obtain 

$$\int \operatorname{div} \mathbf{H} \, dV = 0 \tag{12.8}$$

Since the surface bounding the volume is completely arbitrary, 
we can always choose this volume to be so small that the integral 
is taken over the region in which div H is of constant sign, if it is 
not equal to zero. But then, contrary to (12.7) and (12.8), the integral 
of div H over that volume would not be equal to zero. Therefore, the 
divergence of H must everywhere vanish: 

div H = 0 (12.9) 

The expression (12.9) is the differential form of (12.7) for an infinitely 
small volume. 

In Section 11 it was shown that the divergence of a vector is the 
density of sources of a vector field. The sources of the field may be 
free charges, as in the case of an electric field. But a magnetic field 
does not correspond to any free charges. 

Equations (12.6) and (12.9) are together called the first pair of 
Maxwell’s equations. Let us now introduce the second pair. 

The Equation for div E. The electric flux across a closed surface 
is equal to the total electric charge inside the surface multiplied 
by $4\pi$ (the Gauss law): 

$$\oint \mathbf{E} \cdot d\mathbf{S} = 4\pi e \tag{12.10}$$

This law is derived from the Coulomb law for point charges. 
The field due to a point charge e is expressed by the following equation: 

$$\mathbf{E} = \frac{e}{r^{2}} \frac{\mathbf{r}}{r}$$

Here, r is a radius vector drawn from the point where the charge is 
located to the point where the field is defined. The field is inversely 
proportional to $r^{2}$ and is directed along the radius vector. 

Let us surround the charge by a spherical surface centred on the 
charge. The element of surface for the sphere, dS, is $r^{2}\, d\Omega\, \mathbf{r}/r$, where 
$d\Omega$ is a solid-angle element, and $\mathbf{r}/r$ indicates the direction of the 
normal to the surface. The flux of the field across the surface element is 

$$\mathbf{E} \cdot d\mathbf{S} = \frac{e}{r^{2}} \frac{\mathbf{r}}{r} \cdot r^{2}\, d\Omega\, \frac{\mathbf{r}}{r} = e\, d\Omega$$

The flux across the whole surface of the sphere is $\int e\, d\Omega = e \int d\Omega = 4\pi e$. 
But since lines of force begin only at a charge, the flux will 
be the same through the sphere as through any closed surface around 
the charge. Therefore, if there is an arbitrary charge distribution e 
inside a closed surface, Eq. (12.10) holds. 
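
The statement that the flux does not depend on the shape of the closed surface can be checked numerically. The sketch below is only an illustration under stated assumptions: the surface is a cube of side 2 centred on the origin, the charge positions are arbitrary test values, and the surface integral is approximated by the midpoint rule. The function name `flux_through_cube` is hypothetical.

```python
import math

def flux_through_cube(charge, position, half_side=1.0, n=200):
    """Midpoint-rule estimate of the flux of E = e r / r^3 through a cube.

    The cube is centred on the origin with faces at +/- half_side;
    `position` is the location of the point charge (Gaussian units).
    """
    px, py, pz = position
    h = 2.0 * half_side / n              # side of one square surface cell
    total = 0.0
    for axis in range(3):                # faces perpendicular to x, y, z
        for sign in (-1.0, 1.0):         # outward normal is sign * unit vector
            for i in range(n):
                u = -half_side + (i + 0.5) * h
                for k in range(n):
                    w = -half_side + (k + 0.5) * h
                    point = [u, w]
                    point.insert(axis, sign * half_side)
                    rx = point[0] - px
                    ry = point[1] - py
                    rz = point[2] - pz
                    r3 = (rx * rx + ry * ry + rz * rz) ** 1.5
                    # component of (r - r_charge) along the outward normal
                    normal_component = sign * (rx, ry, rz)[axis]
                    total += charge * normal_component / r3 * h * h
    return total

if __name__ == "__main__":
    print(flux_through_cube(1.0, (0.2, 0.1, -0.3)), 4 * math.pi)  # charge inside
    print(flux_through_cube(1.0, (3.0, 0.0, 0.0)))                # charge outside
```

For a unit charge placed off-centre inside the cube the estimate should agree with $4\pi$ to within the discretization error, and for a charge outside the cube it should be close to zero, as Eq. (12.10) requires.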

In order to rewrite this equation in differential form, we introduce 
the concept of charge density. The charge density $\rho$ is the charge 
contained in unit volume, so that the total charge in a volume is 
related to the density by the following equation: 

$$e = \int \rho \, dV \tag{12.11}$$

Hence, $\rho = \lim_{\Delta V \to 0} \Delta e / \Delta V$. Introducing the charge density into (12.10) 
and transforming the left-hand side by the Gauss theorem (11.6), we obtain 

$$\int (\operatorname{div} \mathbf{E} - 4\pi\rho) \, dV = 0 \tag{12.12}$$

Repeating the same reasoning for this integral as we applied to 
(12.8), we obtain an equation of similar form for an electric field: 

$$\operatorname{div} \mathbf{E} = 4\pi\rho \tag{12.13}$$

According to (11.8), we can say that the density of sources of an 
electric field is equal to the electric charge density multiplied 
by $4\pi$. 

In formulating the fundamental laws of electrodynamics it is 
often convenient to treat charges as point charges (remembering 
that these charges should not be physically identified with electrons!). 
For point charges the density function is given by means 
of a limiting process. 

Let us first assume that a charge of finite magnitude is distributed 
over a small, but also finite, volume $\Delta V$. Then $\rho$ must be regarded 
as the ratio $e/\Delta V$. If we let the volume $\Delta V$ tend to zero, then the 
density function will have a very peculiar form: it will turn out 
to be equal to zero everywhere except at the place where the charge 
is situated, and at that point it will become infinite, since the 
numerator of the fraction $e/\Delta V$ is finite and the denominator is 
infinitely small. However, the integral 

$$\int \rho \, dV$$

remains equal to the charge e. 

Thus, the concept of charge density can also be used in the case 
of a point charge. In this case $\rho$ is understood to be a function which 
is equal to zero everywhere except at the point of the charge. The 
volume integral of this function is either equal to the charge e itself, 
if the charge is situated inside the integration region, or zero, if the 
charge is outside the region of integration. 

The Law of Charge Conservation. One of the most important laws 
of electrodynamics is the law of conservation of charge: the total 
charge of any system remains constant if no external charges are 
brought into it. In all charge transformations occurring in nature, 
the law of conservation of charge is satisfied with extreme precision 
(while the law of conservation of mass is approximate). 

In order to formulate the charge-conservation law in differential 
form, we must introduce the concept of current density. This vector 
quantity is defined as 

$$\mathbf{j} = \rho \mathbf{v} \tag{12.14}$$

where v is the charge velocity at the point where the density $\rho$ is 
defined. The dimension of charge density is charge/cm$^{3}$, and of 
current density, charge/(cm$^{2}$·s) (that is, the dimensions of charge passing 
in unit time across unit area). In particular, for the point charge 
in Eq. (12.14), v denotes its velocity, and $\rho$ the density function 
defined above. 

The total current emerging through a closed surface is 

$$I = \oint \mathbf{j} \cdot d\mathbf{S} \tag{12.15}$$

According to the charge conservation law, I must be equal to the 
reduction of charge inside the surface in unit time: 

$$I = -\frac{\partial e}{\partial t} \tag{12.16}$$

As we have already done in other cases, we pass from this integral 
form of the charge-conservation law to the differential form. Substituting 
e from (12.11) and transforming I by the Gauss theorem, we 
obtain 

$$\int \left( \frac{\partial \rho}{\partial t} + \operatorname{div} \rho \mathbf{v} \right) dV = 0 \tag{12.17}$$

Since the volume over which the integration is performed is arbitrary, 
the charge-conservation law in differential form follows from 
(12.17): 

$$\frac{\partial \rho}{\partial t} + \operatorname{div} \mathbf{j} = \frac{\partial \rho}{\partial t} + \operatorname{div} \rho \mathbf{v} = 0 \tag{12.18}$$
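
A simple case in which (12.18) can be verified directly is a charge cloud that moves as a rigid whole. The sketch below assumes a one-dimensional Gaussian density profile moving with constant velocity; the profile, the velocity value and the function names are hypothetical choices made only for this illustration. The derivatives are estimated by central differences.

```python
import math

def rho(x, t, v=2.0):
    """Charge density of a rigid Gaussian cloud moving with velocity v."""
    return math.exp(-(x - v * t) ** 2)

def current(x, t, v=2.0):
    """Current density j = rho * v, Eq. (12.14), in one dimension."""
    return rho(x, t, v) * v

def continuity_residual(x, t, v=2.0, h=1e-4):
    """Central-difference estimate of d(rho)/dt + d(j)/dx at the point (x, t)."""
    drho_dt = (rho(x, t + h, v) - rho(x, t - h, v)) / (2 * h)
    dj_dx = (current(x + h, t, v) - current(x - h, t, v)) / (2 * h)
    return drho_dt + dj_dx

if __name__ == "__main__":
    for x in (-1.0, 0.3, 2.5):
        print(x, continuity_residual(x, 0.7))
```

The residual should vanish up to the finite-difference error, because for a rigidly moving cloud the local decrease of charge is exactly balanced by the divergence of the current.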

Displacement Current. From direct-current theory it is known that 
current lines are always closed. Indeed, open lines indicate that 
there is either an accumulation or loss of charge at their ends. But 
we can also define vector lines such that they will always be closed 
or go to infinity in the case of alternating currents. For this we 
substitute the derivative $\partial\rho/\partial t$ according to (12.13) into the equation 
of the charge-conservation law (12.18). This derivative is equal 
to $(1/4\pi) \operatorname{div} (\partial \mathbf{E}/\partial t)$. Hence, we always have the relation 

$$\operatorname{div} \left( \mathbf{j} + \frac{1}{4\pi} \frac{\partial \mathbf{E}}{\partial t} \right) = 0 \tag{12.19}$$

Comparing (12.19) and (12.9), we see that the vector lines 

$$\mathbf{j} + \frac{1}{4\pi} \frac{\partial \mathbf{E}}{\partial t}$$

are always closed. The vector $(1/4\pi)(\partial \mathbf{E}/\partial t)$ is called the displacement 
current, because it is not associated with the transfer of charge.³ 
Together with the charge-transport current, the displacement current 
forms a closed system of vector lines. 

So far all we have done is to rewrite in a somewhat different form 
the laws of electrodynamics known from elementary physics. Now 
we must introduce a substantially new assumption: the magnetic 
action of the displacement current differs in no way from the magnetic 
action of the charge-transfer current. This was Maxwell’s assumption 
when he formulated the general laws of electrodynamics. 

The Equation for curl H. Equation (12.13) belongs to the second 
pair of Maxwell’s equations. Another equation of this pair defines 
curl H. Just as in the development of (12.13) we made use of the 
Coulomb law, here we shall require the law of Biot-Savart and the 
hypothesis of the magnetic action of the displacement current. 

³ In future we shall use simply “current” instead of “current density”. 

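First we write the Biot-Savart law in elementary form. Let a current 
element of length $d\mathbf{l}_{1}$ (Figure 20) be located at a point whose 
radius vector is $\mathbf{r}_{1}$, the current strength being I. Then at a point 
with radius vector r this current produces a field dH defined by the 
law 

$$d\mathbf{H} = \frac{I}{c} \frac{d\mathbf{l}_{1} \times (\mathbf{r} - \mathbf{r}_{1})}{|\mathbf{r} - \mathbf{r}_{1}|^{3}} \tag{12.20}$$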

It is more convenient to deal with a spatially distributed rather 
than a linear current. If the current density is $\rho\mathbf{v}$ (neglecting, for 
the time being, the displacement current), the current strength can be 
written as the total flux across an area dS: 

$$I = \rho\, (\mathbf{v} \cdot d\mathbf{S}) \tag{12.21}$$

Figure 20 

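Hence, to obtain the magnetic field of the whole distribution of 
currents, the fundamental law (12.20) must be integrated over a 
certain area crossed by the current and along the whole circuit $d\mathbf{l}_{1}$: 

$$\mathbf{H} = \frac{1}{c} \int \rho\, (\mathbf{v} \cdot d\mathbf{S}) \oint \frac{d\mathbf{l}_{1} \times (\mathbf{r} - \mathbf{r}_{1})}{|\mathbf{r} - \mathbf{r}_{1}|^{3}} \tag{12.22}$$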

This yields the portion of the magnetic field due to the currents 
passing along the outer circuit enclosing this area (see Figure 20). 
If we add the displacement current $(1/4\pi)(\partial \mathbf{E}/\partial t)$ to $\rho\mathbf{v}$, the vector 
lines of the resultant current will close. In Figure 20 these closed 
lines cross the unhatched circuit. The points of this circuit and the 
surface stretched across it are defined by the radius vector r, while 
the points of the vector line of the total current and the corresponding 
hatched surface are defined by the radius vector $\mathbf{r}_{1}$. Note that 
if we had not introduced the displacement current, the paths of 
point charges could at no given instant be closed: the current at 
that instant is not zero where there is a charge but it is zero at all 
other points in space. Fully closed lines, as in Figure 20, are obtained 
only together with the displacement current. 

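Now let us integrate the expression for the magnetic field along 
the contour of the unhatched surface, that is, over $d\mathbf{l}$. Then, from 
(12.22), after adding $(1/4\pi)(\partial \mathbf{E}/\partial t)$ and rearranging the order of 
integration, we obtain 

$$\oint \mathbf{H} \cdot d\mathbf{l} = \frac{1}{c} \int \left( \rho\mathbf{v} + \frac{1}{4\pi} \frac{\partial \mathbf{E}}{\partial t} \right) \cdot d\mathbf{S} \oint \oint \frac{d\mathbf{l} \cdot [d\mathbf{l}_{1} \times (\mathbf{r} - \mathbf{r}_{1})]}{|\mathbf{r} - \mathbf{r}_{1}|^{3}} \tag{12.23}$$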


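Let us now show that the inner double integral 

$$A = \oint \oint \frac{d\mathbf{l} \cdot [d\mathbf{l}_{1} \times (\mathbf{r} - \mathbf{r}_{1})]}{|\mathbf{r} - \mathbf{r}_{1}|^{3}}$$

is equal to $4\pi$ if the unhatched contour and the vector line contour 
along which integration over $d\mathbf{l}_{1}$ is performed are connected; otherwise 
it is equal to zero. 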

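We denote differentiation with respect to the components of 
vector $\mathbf{r}_{1}$ by the symbol $\nabla_{1}$, and differentiation with respect to the 
components of r by the symbol $\nabla$. Since the integrand depends only 
upon the difference $\mathbf{r} - \mathbf{r}_{1}$, we can write symbolically $\nabla_{1} = -\nabla$. 
We also replace $(\mathbf{r} - \mathbf{r}_{1})/|\mathbf{r} - \mathbf{r}_{1}|^{3}$ by $\nabla_{1} |\mathbf{r} - \mathbf{r}_{1}|^{-1}$: 

$$A = \oint \oint d\mathbf{l} \cdot \left[ d\mathbf{l}_{1} \times \nabla_{1} \frac{1}{|\mathbf{r} - \mathbf{r}_{1}|} \right] \tag{12.24}$$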

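We perform a cyclic permutation in the mixed product so that 
the element of length $d\mathbf{l}_{1}$ appears after the sign of the vector 
product: 

$$A = \oint \oint \left[ \nabla_{1} \frac{1}{|\mathbf{r} - \mathbf{r}_{1}|} \times d\mathbf{l} \right] \cdot d\mathbf{l}_{1} \tag{12.25}$$

We have obtained an integral along the circuit $d\mathbf{l}_{1}$ to which 
Stokes’ theorem can be applied, so that 

$$A = \oint \int \operatorname{curl}_{1} \left( \nabla_{1} \frac{1}{|\mathbf{r} - \mathbf{r}_{1}|} \times d\mathbf{l} \right) \cdot d\mathbf{S}_{1} \tag{12.26}$$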

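Here the integration is taken over the surface bounded by the current 
line. Using (11.30) and remembering that $d\mathbf{l}$ is not differentiated, 
we expand the curl of the vector product. As a result, the term containing 
$\operatorname{div} \nabla_{1} |\mathbf{r} - \mathbf{r}_{1}|^{-1} = \nabla_{1}^{2} |\mathbf{r} - \mathbf{r}_{1}|^{-1}$ vanishes, leaving only 

$$A = -\oint (d\mathbf{l} \cdot \nabla) \int \frac{(\mathbf{r} - \mathbf{r}_{1}) \cdot d\mathbf{S}_{1}}{|\mathbf{r} - \mathbf{r}_{1}|^{3}} \tag{12.27}$$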

The quantity under the inner integral sign has a direct geometrical 
meaning. Draw a straight line from a point on the circuit $d\mathbf{l}$ 
to its intersection with surface $S_{1}$. The vector $\mathbf{r} - \mathbf{r}_{1}$ lies along this 
line. Its scalar product with $d\mathbf{S}_{1}$, divided by $|\mathbf{r} - \mathbf{r}_{1}|$, 
denotes the projection of the area $dS_{1}$ on a surface perpendicular to 
$\mathbf{r} - \mathbf{r}_{1}$. Hence, the expression 

$$\frac{1}{|\mathbf{r} - \mathbf{r}_{1}|^{2}} \, \frac{(\mathbf{r} - \mathbf{r}_{1}) \cdot d\mathbf{S}_{1}}{|\mathbf{r} - \mathbf{r}_{1}|}$$

is a solid-angle element at which the area $dS_{1}$ is seen from the point 
on the circuit $d\mathbf{l}$. The inner integral in (12.27) thus represents the 
total solid angle $\Omega$ at which the path described by the current line 
is seen from the point on the circuit $d\mathbf{l}$. 

In writing (12.27) we replaced $\nabla_{1}$ by $-\nabla$ and took it outside the 
integral over $d\mathbf{S}_{1}$. The product $(d\mathbf{l} \cdot \nabla)\,\Omega$ is the differential $d\Omega$ taken 
along the circuit $l$. Now let us calculate the integral along the circuit 
$l$. Suppose the current line was traced clockwise. Then the 
positive side of the surface $S_{1}$ stretched over this circuit lies under 
the page of Figure 20. If we look at the circuit of $S_{1}$, as it is shown 
in the drawing, the scalar product $(\mathbf{r} - \mathbf{r}_{1}) \cdot d\mathbf{S}_{1}$ is positive, and the 
solid angle must be taken with the plus sign. 

Let the initial point of traversing circuit $l$ be at the intersection 
of the circuit with surface $S_{1}$. We start the traverse on the side seen 
in the drawing. At this point the surface occupies the half-space 
seen from it, and therefore $\Omega = 2\pi$. Moving along the circuit $l$, we 
gradually reduce the solid angle, so that $d\Omega < 0$. On the reverse 
side of the surface $S_{1}$ the scalar product is negative, and the solid 
angle is equal to $-2\pi$. The total variation of the solid angle is 
$2\pi - (-2\pi)$, or $4\pi$, taking into account the minus sign in (12.27). 

This is the case of a circuit connected with the current line; for 
a circuit that is not connected with it the traverse returns to the same 
point with the same value $\Omega = 2\pi$, and the integral becomes zero. It follows that 
the integral (12.23) involves the contribution of only those current 
lines that are connected with the circuit $l$, and is therefore equal to 

$$\oint \mathbf{H} \cdot d\mathbf{l} = \frac{4\pi}{c} \int \left( \rho\mathbf{v} + \frac{1}{4\pi} \frac{\partial \mathbf{E}}{\partial t} \right) \cdot d\mathbf{S} \tag{12.28}$$
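
The geometric result for the double integral A can also be checked by direct numerical integration. The sketch below is an illustration under stated assumptions: both circuits are taken to be unit circles (hypothetical test shapes, not those of Figure 20), one in the xy-plane and one in the xz-plane, and the integrals are replaced by sums over equal steps in the angle. The function name `linking_integral` is likewise hypothetical.

```python
import math

def linking_integral(centre1, n=200):
    """Estimate A = oint oint dl . [dl1 x (r - r1)] / |r - r1|^3.

    Circuit l  : unit circle in the xy-plane, centred on the origin.
    Circuit l1 : unit circle in the xz-plane, centred on `centre1`.
    """
    step = 2.0 * math.pi / n
    cx, cy, cz = centre1
    total = 0.0
    for a in range(n):
        s = a * step
        r = (math.cos(s), math.sin(s), 0.0)
        dl = (-math.sin(s) * step, math.cos(s) * step, 0.0)
        for b in range(n):
            t = b * step
            r1 = (cx + math.cos(t), cy, cz + math.sin(t))
            dl1 = (-math.sin(t) * step, 0.0, math.cos(t) * step)
            d = (r[0] - r1[0], r[1] - r1[1], r[2] - r1[2])
            dist3 = (d[0] ** 2 + d[1] ** 2 + d[2] ** 2) ** 1.5
            # cross = dl1 x (r - r1)
            cross = (dl1[1] * d[2] - dl1[2] * d[1],
                     dl1[2] * d[0] - dl1[0] * d[2],
                     dl1[0] * d[1] - dl1[1] * d[0])
            total += (dl[0] * cross[0] + dl[1] * cross[1]
                      + dl[2] * cross[2]) / dist3
    return total

if __name__ == "__main__":
    print(abs(linking_integral((1.0, 0.0, 0.0))), 4 * math.pi)  # connected circuits
    print(linking_integral((5.0, 0.0, 0.0)))                    # separate circuits
```

With the second circle centred at (1, 0, 0) it passes through the disc bounded by the first, so the two circuits are connected and the magnitude of A should be $4\pi$; the sign depends on the directions in which the two circuits are traversed. With the centre moved to (5, 0, 0) the circuits are separate and A should vanish.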

Now, transforming the left-hand side of (12.28) according to 
Stokes’ theorem, and taking into account that the surface over 
which the integration is carried out is arbitrary, we arrive at Maxwell’s 
equation for curl H: 

$$\operatorname{curl} \mathbf{H} = \frac{4\pi}{c} \mathbf{j} + \frac{1}{c} \frac{\partial \mathbf{E}}{\partial t} \tag{12.29}$$

It will be readily observed that this equation is in agreement 
with the law of charge conservation. Indeed, let us perform the 
divergence operation on it. From (11.40), div curl H = 0, so that 
only the divergence of the right-hand side remains. If we substitute 
div E according to (12.13), we again arrive at (12.18), that is, at 
the law of charge conservation: 

$$\operatorname{div} \mathbf{j} + \frac{1}{4\pi} \frac{\partial}{\partial t} \operatorname{div} \mathbf{E} = \operatorname{div} \mathbf{j} + \frac{\partial \rho}{\partial t} = 0$$

Equation (12.29) is not just a way of writing the Biot-Savart 
law in differential form. It involves the displacement current, which 
does not appear in direct-current theory. 

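Maxwell’s Equations. Let us once again write the set of Maxwell’s 
equations in the form derived from the fundamental laws of electromagnetism. 

The first pair: 

$$\operatorname{curl} \mathbf{E} = -\frac{1}{c} \frac{\partial \mathbf{H}}{\partial t} \tag{12.30}$$

$$\operatorname{div} \mathbf{H} = 0 \tag{12.31}$$

The second pair: 

$$\operatorname{curl} \mathbf{H} = \frac{4\pi}{c} \mathbf{j} + \frac{1}{c} \frac{\partial \mathbf{E}}{\partial t} \tag{12.32}$$

$$\operatorname{div} \mathbf{E} = 4\pi\rho \tag{12.33}$$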


In these equations we consider $\rho$ and j, that is, the time-dependent 
charge and current distributions in space, to be known. The unknowns 
to be determined are both electromagnetic field components, 
E and H. Each of them has three vector components. 

In spite of the fact that both pairs form, together, eight equations, 
only six of them are independent, according to the number of field 
components. Indeed, the three components of each curl are constrained 
by div curl = 0, and, hence, are not independent of one another. 

It was shown in the first part of this book that the laws of mechanics 
can be derived from certain symmetry laws and the principle 
of least action. This approach reveals more vividly the totality of 
experimental facts underlying Newtonian mechanics. We can approach 
electrodynamics in a similar way. Then its fundamental 
laws will be found to be simple corollaries of very general regularities. 
The very procedure of developing Maxwell’s equations is thereby 
greatly simplified. 

Electromagnetic Potentials. We can introduce new unknown quantities 
into the equations of electrodynamics so that each equation 
contains only one unknown. In this way the overall number of equations 
is reduced. These new quantities are called electromagnetic 
potentials. 

We choose the potentials so that the first pair of Maxwell’s equations 
is identically satisfied. In order to satisfy Eq. (12.31), it is 
sufficient to put 

H = curl A (12.34) 

where A is a vector called the vector potential. Then, according 
to (11.40), div H will be equal to zero identically. The electric 
field should be represented in the form 

$$\mathbf{E} = -\frac{1}{c} \frac{\partial \mathbf{A}}{\partial t} - \operatorname{grad} \varphi \tag{12.35}$$

where $\varphi$ is a quantity called the scalar potential. From (11.39), curl 
grad $\varphi$ = 0. Substitution of Eqs. (12.34) and (12.35) into (12.30) 
leads to an identical cancelling out. 

The electromagnetic field vectors are the physically determinate 
quantities, insofar as they are involved in the expressions of forces 
acting on the charges and currents. The field strengths, or simply 
fields, as is conventionally said, are obtained from the potentials 
by differentiation. Therefore, the potentials are determined up to 
expressions that cancel out in the differentiation. It is natural to 
select the expressions in such a way as to make the potential equations 
as simple as possible. Let us first find the most general transformation 
of potentials which does not change the fields in Eqs. (12.34) 
and (12.35). 

From Eq. (12.34) it can be seen that if we add the gradient of any 
arbitrary function to the vector potential, the magnetic field does 
not change, since the curl of a gradient is identically zero. Putting 

$$\mathbf{A} = \mathbf{A}' + \operatorname{grad} f(\mathbf{r}, t) \tag{12.36}$$

we see that the magnetic field, expressed in terms of such a modified 
potential, remains unchanged: 

H = curl A = curl A' 

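In order that the addition of grad $f$ to the vector potential should 
not affect the electric field, we must also change the scalar potential: 

$$\varphi = \varphi' - \frac{1}{c} \frac{\partial f}{\partial t} \tag{12.37}$$

where $f$ is the same function as in (12.36). Then for the electric field 
we obtain 

$$\mathbf{E} = -\frac{1}{c} \frac{\partial \mathbf{A}}{\partial t} - \operatorname{grad} \varphi = -\frac{1}{c} \frac{\partial \mathbf{A}'}{\partial t} - \frac{1}{c} \frac{\partial}{\partial t} \operatorname{grad} f - \operatorname{grad} \varphi' + \operatorname{grad} \frac{1}{c} \frac{\partial f}{\partial t} = -\frac{1}{c} \frac{\partial \mathbf{A}'}{\partial t} - \operatorname{grad} \varphi'$$

Consequently, the electric field does not change either. Thus, 
the potentials are determined to the accuracy of the transformations 
(12.36) and (12.37), which are called the gauge transformations. 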
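
The cancellation can be seen in a small numerical experiment. The sketch below is an illustration under stated assumptions: the problem is reduced to one space dimension, units are chosen so that c = 1, and the potentials and the gauge function are arbitrary smooth test functions with hypothetical names. The primed potentials are formed as A' = A − ∂f/∂x and φ' = φ + (1/c) ∂f/∂t, which is the form that follows from the derivation above, and the electric field is computed from both sets by central differences.

```python
import math

C = 1.0  # units with c = 1, chosen only to keep the numbers of order unity

def d_dx(func, x, t, h=1e-3):
    """Central-difference derivative with respect to x."""
    return (func(x + h, t) - func(x - h, t)) / (2 * h)

def d_dt(func, x, t, h=1e-3):
    """Central-difference derivative with respect to t."""
    return (func(x, t + h) - func(x, t - h)) / (2 * h)

def vector_potential(x, t):      # x-component of A (arbitrary test function)
    return math.sin(x - 0.5 * t)

def scalar_potential(x, t):      # phi (arbitrary test function)
    return math.cos(2 * x) * math.exp(-t)

def gauge_function(x, t):        # f(r, t), completely arbitrary
    return x * x * math.sin(t) + math.cos(x * t)

def vector_potential_prime(x, t):
    return vector_potential(x, t) - d_dx(gauge_function, x, t)

def scalar_potential_prime(x, t):
    return scalar_potential(x, t) + d_dt(gauge_function, x, t) / C

def electric_field(vec_pot, scal_pot, x, t):
    """E = -(1/c) dA/dt - d(phi)/dx, the one-dimensional form of (12.35)."""
    return -d_dt(vec_pot, x, t) / C - d_dx(scal_pot, x, t)

if __name__ == "__main__":
    x, t = 0.7, 0.4
    print(electric_field(vector_potential, scalar_potential, x, t))
    print(electric_field(vector_potential_prime, scalar_potential_prime, x, t))
```

The two printed values should coincide up to rounding error although the potentials themselves differ appreciably, which is the content of gauge invariance for the electric field. In one dimension the magnetic field of these potentials is identically zero, so only E is compared.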


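The Equations for Potentials. Let us now choose the arbitrary function 
$f$ such that the second pair of Maxwell’s equations leads to equations 
for the potentials of the simplest possible form. Substitution 
of (12.34) and (12.35) into (12.32) gives 

$$\operatorname{curl} \operatorname{curl} \mathbf{A} = -\frac{1}{c^{2}} \frac{\partial^{2} \mathbf{A}}{\partial t^{2}} - \frac{1}{c} \frac{\partial}{\partial t} \operatorname{grad} \varphi + \frac{4\pi}{c} \mathbf{j} \tag{12.38}$$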

We express curl curl A with the aid of (11.42). Then (12.38) is 
reduced to the following form: 

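$$-\nabla^{2} \mathbf{A} + \frac{1}{c^{2}} \frac{\partial^{2} \mathbf{A}}{\partial t^{2}} + \operatorname{grad} \left( \operatorname{div} \mathbf{A} + \frac{1}{c} \frac{\partial \varphi}{\partial t} \right) = \frac{4\pi}{c} \mathbf{j} \tag{12.39}$$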

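We shall now try to eliminate the quantity inside the parentheses 
in the left-hand side of (12.39). We denote it for brevity by $a$, and 
we perform the transformations (12.36) and (12.37) on the potentials. 
Then the quantity $a$ is reduced to the form 

$$a = \operatorname{div} \mathbf{A} + \frac{1}{c} \frac{\partial \varphi}{\partial t} = \operatorname{div} \mathbf{A}' + \frac{1}{c} \frac{\partial \varphi'}{\partial t} + \nabla^{2} f - \frac{1}{c^{2}} \frac{\partial^{2} f}{\partial t^{2}} \tag{12.40}$$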

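The function $f$ has, so far, remained arbitrary. Let us now assume 
that it has been chosen so as to satisfy the equation 

$$\nabla^{2} f - \frac{1}{c^{2}} \frac{\partial^{2} f}{\partial t^{2}} = a \tag{12.41}$$

Then, from (12.40), it is obvious that the potentials will be subject 
to the condition 

$$\operatorname{div} \mathbf{A}' + \frac{1}{c} \frac{\partial \varphi'}{\partial t} = 0 \tag{12.42}$$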

This is called the Lorentz condition. 

As was shown, the expression of fields in terms of potentials is 
not changed by gauge transformations. For this reason we shall 
in future always consider that these transformations are performed 
so that the Lorentz condition is satisfied; the primes in the potentials 
can then be omitted. 

From the Lorentz condition and (12.39) we obtain the equation 
for the vector potential: 

$$\nabla^{2} \mathbf{A} - \frac{1}{c^{2}} \frac{\partial^{2} \mathbf{A}}{\partial t^{2}} = -\frac{4\pi}{c} \mathbf{j} \tag{12.43}$$


It is now also easy to obtain the equation for the scalar potential. 
From (12.33) and (12.35) we have 

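$$\operatorname{div} \mathbf{E} = -\frac{1}{c} \frac{\partial}{\partial t} \operatorname{div} \mathbf{A} - \nabla^{2} \varphi = 4\pi\rho$$

Substituting div A from the Lorentz condition (12.42), we obtain 

$$\nabla^{2} \varphi - \frac{1}{c^{2}} \frac{\partial^{2} \varphi}{\partial t^{2}} = -4\pi\rho \tag{12.44}$$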


Equations (12.43) and (12.44) each contain one unknown. Therefore, 
none of the equations for the potentials depends on the others, and 
they can be solved separately. This is valid, however, only in Cartesian 
coordinates; in curvilinear coordinates different components 
of A are involved in the same equations. 

The equations for potentials are second-order equations with respect 
to the coordinate and time derivatives. To solve such equations 
the initial values must be stated not only of the potentials but of 
their time derivatives as well. 
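
In a region free of charges the right-hand side of (12.44) is zero, and the equation is satisfied by any disturbance that travels with the speed c. The sketch below checks this for one such solution by second-order central differences. It is an illustration under stated assumptions: one space dimension, units with c = 1, and a cosine wave with a hypothetical wave number K; the function names are likewise hypothetical.

```python
import math

C = 1.0   # units with c = 1 (an assumption to keep numbers of order unity)
K = 2.0   # wave number of the hypothetical test wave

def phi(x, t):
    """Scalar potential of a plane wave travelling along x with speed C."""
    return math.cos(K * (x - C * t))

def wave_operator(func, x, t, h=1e-3):
    """Central-difference estimate of d2(func)/dx2 - (1/C^2) d2(func)/dt2."""
    d2x = (func(x + h, t) - 2 * func(x, t) + func(x - h, t)) / h ** 2
    d2t = (func(x, t + h) - 2 * func(x, t) + func(x, t - h)) / h ** 2
    return d2x - d2t / C ** 2

if __name__ == "__main__":
    print(wave_operator(phi, 0.3, 0.2))                           # travelling wave
    print(wave_operator(lambda x, t: math.cos(K * x), 0.3, 0.2))  # static profile
```

For the travelling wave the operator should vanish up to rounding error. For the static profile cos(Kx), which is included only for contrast, it gives approximately −K² cos(Kx), so that profile could satisfy (12.44) only in the presence of a charge density.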

As we shall see later, in very many cases it is necessary to use 
equations involving not the electromagnetic fields but the potentials 
that define them. But since potentials are not single-valued 
and may, for the same electromagnetic fields, receive different 
supplementary terms according to Eqs. (12.36) and (12.37), care 
must be taken that the form of any equation involving potentials 
does not change in gauge transformations. The point is that these 
transformations involve a completely arbitrary function $f$, which 
can be chosen in any form. Obviously, this requirement does not 
refer to the relations defining the potentials, that is (12.43) and 
(12.44), the validity of which is connected with the choice of potentials 
satisfying the Lorentz condition. 

No physical result can be dependent on the choice of an arbitrary 
function $f$ on which no preliminary restrictions were imposed. The 
Lorentz condition was selected only for the purpose of simplifying 
(12.43) and (12.44); it is not physically necessary. 

It is said that physical equations, that is, equations for directly 
observable quantities, must be gauge invariant. 

EXERCISE 

Write Maxwell’s equations and the equations for potentials in cylindrical 
coordinates, for which $dl^{2} = dr^{2} + r^{2} d\psi^{2} + dz^{2}$. 


13 


EINSTEIN’S RELATIVITY PRINCIPLE 

Addition of Velocities and Electrodynamics. Maxwell’s equations 
involve the constant c, which has the dimensions of velocity. It will 
be shown in Section 18 that c not merely happens to have the same 
dimensions as velocity but is in fact the speed with which electromagnetic 
waves propagate in vacuum. This result is obtained from 
a solution of the set (12.30)-(12.33), given the assumption that $\rho = 0$, 
$\mathbf{j} = 0$. 

Suppose the quantities involved in these equations have been 
measured by an observer subject to inertial motion. Suppose, furthermore, 
that there is another observer moving relative to the 
first at a constant speed V. In mechanics the principle of relativity 
states that any equation must have the same form relative to different 
inertial frames of reference. In what conditions can this be 
valid in electrodynamics? Can the laws of electrodynamics be of 
the same form in the frame of reference of an initial observer, in 
which electromagnetic waves propagate in all directions with the 
speed c, and in the frame of another observer moving with a constant 
velocity V with respect to the first frame? 

At first this seems impossible. According to the law of addition 
of velocities, in the second observer’s frame of reference electromagnetic 
waves should propagate with a velocity 

c' = c + V (13.1) 

where we have assumed, for the sake of simplicity, the waves to be 
propagating in the direction of the relative velocity of the reference 
frames. Reversal of the motion would reverse the sign in the equation; 
for the case of perpendicular motion the velocities would have 
to be added as a vector sum. 

Thus, in the second observer’s reference frame, which is also 
moving inertially, the propagation speed of electromagnetic waves 
would have to depend on direction. But since a travelling electromagnetic 
wave is one of the possible solutions of Maxwell’s equations, 
they would have to be of different form in different inertial 
frames of reference. 

It follows that only one of two statements can be true: either the 
conventional law of addition of velocities (13.1) is valid, and the 
relativity principle does not apply to electromagnetic fields, or 
the principle is valid for electromagnetism, and addition of velocities 
does not hold in its simplest form (13.1). 

The Experiment of Michelson. The validity of the second assumption, 
that is, that the speed of light and, in general, of any electromagnetic 
disturbance in vacuum does not combine with the velocity 
of a reference system, was demonstrated experimentally by Michelson 
in 1887. He showed that the velocity of light as measured in any 
inertial frame of reference is the same and equal to the fundamental 
constant c. Here is a brief description of the experiment. 

A beam of light falls on a half-silvered mirror SS (Figure 21), 
where it is split in two: one part is reflected and falls on mirror A, 
the other passes through and falls on mirror B. Let beam SA be 
perpendicular to the velocity of the earth in its motion around the 
sun; then beam SB is parallel to that velocity. The light reflected 
from mirrors A and B returns to mirror SS; beam BS is reflected 
and falls on screen C, while beam AS passes through SS and falls 
on C directly. Both rays are thus entirely equivalent as regards 
their passage through, and reflections from, SS, but on sections AS 
and BS the light propagates differently relative to the earth’s 
motion. 

Let us see what effect could be expected if the velocity of light 
was compounded with the velocity of the earth according to the 
conventional law. Along the path SB the velocity of light relative to 
the earth would be $c - V$, and in the reverse direction $c + V$, where 
V is the velocity of the earth. In the assumption made, the time 
it takes for the light to travel the whole path SBS in both directions is 

$$\frac{l}{c + V} + \frac{l}{c - V} = \frac{2lc}{c^{2} - V^{2}} \approx \frac{2l}{c} + \frac{2lV^{2}}{c^{3}}$$

where $l = SB$. In going over to the approximate equation we made 
use of the fact that $V \ll c$. 

Along section SA the velocities of the earth and light are perpendicular 
(in the reference frame fixed with respect to the apparatus). 
Assuming again that the velocity of light is added to that of the 
earth, we must now apply the vector law of addition. Then along 
section SA the velocity of light relative to the apparatus is equal 
to $(c^{2} - V^{2})^{1/2}$, c being the hypotenuse of the right triangle, and V 
and $(c^{2} - V^{2})^{1/2}$, its sides. The time it takes light to travel the whole 
path SAS, equal to $2l$, is 

$$\frac{2l}{(c^{2} - V^{2})^{1/2}} \approx \frac{2l}{c} + \frac{lV^{2}}{c^{3}}$$

Thus, the difference between the times it takes light to travel
along the paths SBS and SAS is equal to $lV^2/c^3$. By means of
multiple reflections the paths travelled by the beams are made fairly
long (tens of metres). By judicious manipulation of the paths the
expected difference between the times it takes light to travel along
SAS and SBS can be made equal to the half-period of the oscillation
of light. Then, if our reasoning was correct, the beams on the
screen should cancel out.
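
The size of the expected effect can be estimated numerically. The sketch below is only an illustration: the arm length of 11 m and the wavelength of 550 nm are assumed values chosen for the example (the text says only "tens of metres"), and the function names are hypothetical. It compares the exact difference of the two travel times, computed under the hypothesis of Galilean addition, with the approximate expression $lV^2/c^3$.

```python
import math

C = 2.998e8        # speed of light, m/s
V_EARTH = 3.0e4    # orbital velocity of the earth, m/s (about 30 km/s)

def t_parallel(l, v, c=C):
    """Round trip along SBS under Galilean addition: l/(c+v) + l/(c-v)."""
    return l / (c + v) + l / (c - v)

def t_perpendicular(l, v, c=C):
    """Round trip along SAS under Galilean addition: 2l/(c^2 - v^2)^(1/2)."""
    return 2.0 * l / math.sqrt(c * c - v * v)

def delta_t_approx(l, v, c=C):
    """Leading-order time difference l v^2 / c^3, valid for v << c."""
    return l * v * v / c**3

arm = 11.0            # assumed effective path length l, m
wavelength = 550e-9   # assumed wavelength of the light, m

exact = t_parallel(arm, V_EARTH) - t_perpendicular(arm, V_EARTH)
approx = delta_t_approx(arm, V_EARTH)
period = wavelength / C   # oscillation period of the light

print(f"exact difference : {exact:.3e} s")
print(f"l V^2 / c^3      : {approx:.3e} s")
print(f"fraction of one period: {approx / period:.2f}")
```

With these assumed numbers the expected delay is a sizeable fraction of one oscillation period, which is why an interference method is sensitive enough to detect it.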

In order to make sure that the cancellation of the beams at a given
point of the screen is due to the addition of the velocities of the
earth and the light beams and not to some other causes, it is
sufficient to rotate the apparatus through 45° so as to direct the
velocity of the earth along the bisector of angle ASB. In this case the
difference in time between the passage of the beams along SAS and
SBS should in any case be zero, provided the difference in the initial
position amounted to one-half the oscillation period. In other words,
the interference fringes on the screen should move by half the
distance between them, so that the light and dark areas interchange
on the screen.

Actually, no change in the path difference of the beams in a
rotation of the apparatus is observed, that is, the expected effect does
not occur. The velocity of light is not added to the velocity of the
earth.

Below we shall examine certain facts which would appear to point
to the opposite conclusion and demonstrate that the contradiction is
an imaginary one. First we shall establish the corollaries that arise
from the fact that addition of the velocity of light to the velocity
of the earth does not occur.

The Relativity Principle Applied to the Electromagnetic Field. As
pointed out in the preceding section, the equations of electrodynamics
do not presume the existence of an elastic medium, the “ether”,
for the transmission of electromagnetic disturbances. The
electromagnetic field is itself a reality, and it can therefore be
expected that the equations of electromagnetics are just as independent
of the choice of an inertial reference frame as the equations of
mechanics. Both sets of equations directly describe motion, that is, the
time rate of change of state of the observed entity. The form of the
equations should not change depending on the chosen inertial frame
of reference.

That is why the result of Michelson’s experiment, far from
contradicting the notion of the relativity of motion, confirms it.
Michelson’s experiment proves that the speed of light in vacuum is the
same in all inertial frames of reference.

The velocity of propagation of interactions is a fundamental
constant in the equations of electrodynamics. These equations can
be invariant under transformations from one inertial system to
another only when the velocity of propagation of interactions is
the same in both systems. The result of Michelson’s experiment
contradicts only the law of addition of velocities, that is, the
Galilean transformation (8.1).

This transformation law, and the consequent law of addition
of velocities, is experimentally confirmed only in the case of relative
velocities and velocities of motion that are small in comparison
with the speed of light c. It follows from Michelson’s experiment
that for reference frames moving with high relative velocities and
entities travelling at high speeds the Galilean transformation must
be replaced by other, more exact transformations. Moreover, these
transformations must be universal in form and the same for both
particles and electromagnetic fields.

Indeed, let the charges in some specified inertial frame of
reference interact in some way with an electromagnetic field, leading
to certain events, for example, collisions of charges. Such events
can be predicted on the basis of the equations of mechanics and
electrodynamics. In a transformation to another inertial frame the
equations of mechanics and electrodynamics must retain their form,
as otherwise other effects would be observed; notably, it could be
found that the collision does not take place at all. But collisions
are objective facts which must be observable in all reference
frames. Yet if we retain the Galilean transformations, even only
for mechanics, and declare that the law of addition of velocities
consequent on them does not apply to electromagnetic fields, a
discrepancy arises which is the greater the closer the velocity of the
inertial system (or the velocities of motion of individual particles)
approaches the speed of light.

We must replace the Galilean transformation by transformations
that leave the equations of both mechanics and electrodynamics
the same. However, in the process we find that the Newtonian
laws of mechanics, valid only for small velocities of particles,
have to be made more precise.

This calls for a revision of the laws of mechanics, however great
our faith in them, based as it is on daily experience with bodies
whose speeds are small in comparison with the speed of light.

The Lorentz Transformation. We shall look for transformations of
a more general form than the Galilean transformation for passing
from one inertial frame of reference to another. Like the Galilean
transformation, they must satisfy certain requirements of a general
nature, listed below.

(i) The transformation equations must be symmetrical with respect
to both inertial frames. We shall denote the quantities referring
to one frame by unprimed letters, and those referring to the
other, by primed letters. Thus, we must obtain equations expressing
x, y, z, and t in terms of x', y', z', and t' in such a way that the
primed quantities are expressed in terms of the unprimed ones,
or the unprimed are expressed in terms of the primed ones, by
equations of the same form.

Denoting the velocity of the primed frame relative to the
unprimed frame by V, we find that the direct transformation equations
must then transform into the reciprocal equations by a simple
substitution of −V for V. This requirement is essential for the
equivalence of both frames.

(ii) The transformation must convert any points in one frame
located at a finite distance from the origin into points that are also
at finite distances from an arbitrary origin in the other frame.

The first requirement greatly restricts the possible form of the
transformations. For example, the transformation functions cannot
be quadratic, because the inversion of a quadratic function leads
to an irrationality, the same as an inversion of a function of any
power but the first. A linear-fractional transformation, that is,
the quotient of two linear expressions, can, under certain
limitations imposed on the coefficients, be inverted with the help of
a function of the same form. For example, for one variable, the direct
and inverse linear-fractional functions look like this:

$$x' = \frac{ax + b}{ex + f}, \qquad x = \frac{b - fx'}{ex' - a}$$

But these functions do not satisfy the second condition: if
x' = a/e, then x becomes infinite. Therefore only a linear function is
possible.
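
The two properties of the linear-fractional function used in this argument can be checked with a few lines of Python. This is only a sketch: the coefficient values are arbitrary assumptions and the function names are hypothetical. It confirms that the inverse has the same form as the direct function, and that a finite x' close to a/e is sent to an arbitrarily distant x, which is what condition (ii) forbids.

```python
def direct(x, a, b, e, f):
    """x' = (a x + b) / (e x + f)."""
    return (a * x + b) / (e * x + f)

def inverse(xp, a, b, e, f):
    """x = (b - f x') / (e x' - a), a function of the same form."""
    return (b - f * xp) / (e * xp - a)

# Arbitrary illustrative coefficients (assumed values).
a, b, e, f = 2.0, 1.0, 1.0, 3.0

x = 0.5
xp = direct(x, a, b, e, f)
x_back = inverse(xp, a, b, e, f)
print(x, xp, x_back)          # x_back reproduces x

# A finite x' just below a/e corresponds to a very distant x.
x_far = inverse(a / e - 1e-9, a, b, e, f)
print(x_far)
```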

(iii) When the relative velocity of the two frames tends to zero,
the transformation equations yield an identity: x = x', y = y',
z = z', and t = t'.

(iv) The transformation equations yield a law of addition of
velocities that leaves the velocity of light in vacuum invariant.

Summing up, we state briefly that the transformation equations (i)
retain their form when inverted, (ii) are linear, (iii) become
identities for small relative velocities, and (iv) leave the velocity
of light in vacuum unchanged.

These four conditions are sufficient. To simplify the calculations,
let us direct any two coordinate axes, say x and x', along the relative
velocities of both systems. Then the coordinates along the other axes
will not be affected by the transformation, that is, y = y', z = z'.

Returning to Figure 9, we shall not make the arbitrary assumption
that t = t' (it is experimentally confirmed only at small relative
velocities of the systems). Then the linear transformations of x
and t can, in the most general form, be expressed as follows:

$$x' = \alpha x + \beta t \tag{13.2}$$

$$t' = \gamma x + \delta t \tag{13.3}$$

The coefficients $\alpha$, $\beta$, $\gamma$, $\delta$ are determined
from conditions (i)-(iv). There is no need to write the constant terms
in these equations: they can be included in x or x' by appropriate
choice of the origin of the reference frame.

Let us apply Eq. (13.2) to the origin of the primed frame of
reference, x' = 0. This point moves with velocity V relative to the
unprimed frame. Hence, x = Vt. Substituting x' = 0, x = Vt
into (13.2), we obtain, after eliminating t,

$$\alpha V + \beta = 0 \tag{13.4}$$


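We shall solve equations (13.2) and (13.3) with respect to x and t.
Elementary algebraic computations give

$$x = \frac{\delta x' - \beta t'}{\alpha\delta - \beta\gamma} \tag{13.5}$$

$$t = \frac{\gamma x' - \alpha t'}{\beta\gamma - \alpha\delta} \tag{13.6}$$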

Let us now apply condition (i). For this we note that the
coefficients $\beta$ and $\gamma$, which interrelate the coordinate and
time, must change sign together with the velocity V. Otherwise, if the
x and x' axes are turned in the opposite direction, the equations will
not preserve their form, and this is impermissible. A transformation
from x to −x and from x' to −x' is equivalent to a transformation
from V to −V; for Eq. (13.2) to remain unchanged we must require
that the sign of $\beta$ reverses together with that of V. This also
agrees with (13.4).

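Thus, it is necessary for the equations of the inverse
transformation from x' to x to differ from the direct transformation,
(13.2) and (13.3), in the signs of $\beta$ and $\gamma$:

$$x = \alpha x' - \beta t' \tag{13.7}$$

$$t = -\gamma x' + \delta t' \tag{13.8}$$

Comparing (13.7) and (13.5), we obtain

$$\alpha = \frac{\delta}{\alpha\delta - \beta\gamma} \tag{13.9}$$

$$-\beta = -\frac{\beta}{\alpha\delta - \beta\gamma} \tag{13.10}$$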


From (13.10) it follows that

$$\alpha\delta - \beta\gamma = 1 \tag{13.11}$$

Then (13.9) yields

$$\alpha = \delta \tag{13.12}$$

This is all that is necessary for symmetry between the direct and
inverse equations.

We now use condition (iv). We divide Eq. (13.2) by (13.3):

$$\frac{x'}{t'} = \frac{\alpha x/t + \beta}{\gamma x/t + \delta} \tag{13.13}$$

Let x be a point occupied by an electromagnetic signal emitted
from the origin of the unprimed frame at an initial instant of time
t = 0. Obviously, x/t = c. But in accordance with condition (iv),
x'/t' = c. Hence

$$c = \frac{\alpha c + \beta}{\gamma c + \delta} \tag{13.14}$$


We substitute the relations (13.4) and (13.12) into (13.14) in
order to eliminate $\beta$ and $\delta$. There remains a relation
between $\alpha$ and $\gamma$:

$$\gamma c^2 + \alpha c = \alpha c - \alpha V$$

whence

$$\gamma = -\alpha \frac{V}{c^2} \tag{13.15}$$

Substituting (13.15), (13.4), and (13.12) into (13.11), we obtain
the equation for $\alpha$:

$$\alpha^2 \left(1 - \frac{V^2}{c^2}\right) = 1 \tag{13.16}$$

In extracting the square root we must take the positive sign in
accordance with condition (iii), because then (13.3) becomes t' = t for
a small relative velocity. (Otherwise we would obtain t' = −t,
which is meaningless.)

Now expressing all the coefficients $\alpha$, $\beta$, $\gamma$, and
$\delta$ in accordance with Eqs. (13.16), (13.4), (13.15), and (13.12),
and substituting into (13.2) and (13.3), we arrive at the required
transformations:

$$x' = \frac{x - Vt}{(1 - V^2/c^2)^{1/2}} \tag{13.17}$$

$$t' = \frac{t - Vx/c^2}{(1 - V^2/c^2)^{1/2}} \tag{13.18}$$

These equations are called the Lorentz transformations. From (13.7)
and (13.8) the inverse transformations are of the form

$$x = \frac{x' + Vt'}{(1 - V^2/c^2)^{1/2}} \tag{13.19}$$

$$t = \frac{t' + Vx'/c^2}{(1 - V^2/c^2)^{1/2}} \tag{13.20}$$
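
The direct and inverse transformations can be tried out numerically. The following is a minimal sketch, not part of the derivation: it works in units where c = 1, the relative velocity and the sample event are arbitrary assumed values, and the function names are hypothetical. It checks condition (i) (the inverse is the direct transformation with V replaced by −V), condition (iii) (identity at V = 0), and condition (iv) (a light signal x = ct is mapped to x' = ct').

```python
import math

def gamma(V, c=1.0):
    """The factor 1 / (1 - V^2/c^2)^(1/2) of Eqs. (13.17)-(13.20)."""
    return 1.0 / math.sqrt(1.0 - (V / c) ** 2)

def lorentz(x, t, V, c=1.0):
    """Direct transformation (13.17), (13.18): unprimed to primed."""
    g = gamma(V, c)
    return g * (x - V * t), g * (t - V * x / c**2)

def lorentz_inverse(xp, tp, V, c=1.0):
    """Inverse transformation (13.19), (13.20): direct one with V -> -V."""
    return lorentz(xp, tp, -V, c)

V = 0.6            # assumed relative velocity, in units of c
x, t = 1.0, 2.0    # an arbitrary event in the unprimed frame

xp, tp = lorentz(x, t, V)
x_back, t_back = lorentz_inverse(xp, tp, V)
print("round trip:", x_back, t_back)          # condition (i)

print("V = 0:", lorentz(x, t, 0.0))           # condition (iii)

xl, tl = lorentz(3.0, 3.0, V)                 # light signal, x = c t
print("light signal: x'/t' =", xl / tl)       # condition (iv)
```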


In order to explain the meaning of these equations we shall apply
them to some special cases. Let a clock be situated at the origin
x' = 0 of the primed frame. It indicates a time t'. Then, from
Eq. (13.20), it follows that

$$t = \frac{t'}{(1 - V^2/c^2)^{1/2}} \tag{13.21a}$$


We shall call the clock at rest relative to the observer’s reference
frame the observer’s clock. From formula (13.21a) we see that an
observer comparing his clock, showing time t, with the clock of
another observer must always conclude that the latter’s clock is
slow, that is, that t' < t. If the clock is at rest at the origin of the
unprimed frame of reference, that is, at point x = 0, the
transformation formula is of the same form, since from (13.18)

$$t' = \frac{t}{(1 - V^2/c^2)^{1/2}} \tag{13.21b}$$


This not only does not contradict (13.21a) but expresses precisely
the same fact: a clock moving relative to a certain observer lags
behind his clock.

In relativity theory there is no single universal time as there is
in Newtonian mechanics. It is better to say that the absolute time
of Newtonian mechanics is an approximate concept, valid only at
small relative velocities of the clocks being compared. The absolute
nature of Newtonian time has sometimes given cause to regard it as
an a priori, logical category independent of the motion of matter.
But it should be remembered that in Newtonian mechanics the
approximate concept of absolute time does not lead to
contradictions, because action at a distance is assumed to be
instantaneous. In formula (13.18) it is sufficient to put $c = \infty$
to obtain t = t'. In Newtonian mechanics instantaneous action was
transmitted at a distance by gravitational forces.

It is sometimes assumed that, knowing the velocity of light c,
it is possible to introduce a correction to the readings of clocks in
different inertial systems for the passage of time to be always the
same. But that is just what Eqs. (13.21a) and (13.21b) do: they
describe the passage of time in two reference frames after introducing
the correction for the finiteness of the propagation time of light.
Time dilation, as it is called, is fully reciprocal in these reference
frames and cannot, therefore, be ascribed to some change in the
properties of the clocks associated with motion. Time dilation is
a purely kinematic effect.
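
The reciprocity expressed by (13.21a) and (13.21b) can be seen by transforming two events, one tick of each clock. The sketch below assumes units with c = 1 and an arbitrarily chosen V = 0.8; the helper names are hypothetical.

```python
import math

def gamma(V, c=1.0):
    return 1.0 / math.sqrt(1.0 - (V / c) ** 2)

def to_primed(x, t, V, c=1.0):
    """Eqs. (13.17), (13.18)."""
    g = gamma(V, c)
    return g * (x - V * t), g * (t - V * x / c**2)

def to_unprimed(xp, tp, V, c=1.0):
    """Eqs. (13.19), (13.20)."""
    g = gamma(V, c)
    return g * (xp + V * tp), g * (tp + V * xp / c**2)

V = 0.8   # assumed relative velocity in units of c

# A clock at rest at x' = 0 shows t' = 1. Unprimed time of this event:
_, t_unprimed = to_unprimed(0.0, 1.0, V)

# A clock at rest at x = 0 shows t = 1. Primed time of this event:
_, t_primed = to_primed(0.0, 1.0, V)

print(t_unprimed, t_primed)   # both exceed 1 by the same factor
```

In each case the reading of the moving clock (unity) is smaller than the time assigned to the same event by the observer relative to whom it moves, and the factor is the same in both directions.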

The relativity of time does not imply rejection of its objective 
nature. It is objective for every reference frame, just as the direction 
of a plumb line, which is different at different points of the globe, 
is an objective reality for each of them. Yet there was a time when 
the vertical direction was thought to be absolute. 

Also relative, we find, is the concept of length of a line segment. 
In order to determine the length of a moving body—a “measuring- 
rod”—the coordinates of its ends must be simultaneously plotted 
in a fixed reference frame. There is no other basically different 
method of measuring a moving measuring-rod at the disposal of 
a stationary observer, since otherwise he would have to bring it 
to a halt, that is, transfer it to his own reference system. He must 
make the “notches” of the ends of the moving measuring-rod
simultaneously according to his clock, say at the same instant t = 0.
The concept of the simultaneity of two operations performed in the 
same reference frame can be uniquely defined with the help of light 
signals. Indeed, observers at rest relative to one another can always 
synchronize their clocks with the help of a light signal, making 
the necessary correction for the propagation time. 

Substituting t = 0 into (13.17), we obtain the expression for the
length of the moving measuring-rod relative to a stationary one:

$$\Delta x' = \frac{\Delta x}{(1 - V^2/c^2)^{1/2}} \tag{13.22}$$

If the observers exchange roles and the one moving with the
measuring-rod measures the “unprimed” observer’s rod, a similar formula
results, in which $\Delta x'$ occurs in the right-hand side, and
$\Delta x$ in the left-hand side. Both equations relating the lengths
of the stationary and the moving measuring-rods express the same fact:
the moving rod contracts relative to the stationary one.
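
The measuring procedure described above, marking both ends of the moving rod at the same instant t = 0, can be written out as a short calculation. This is a sketch under assumed values (c = 1, V = 0.6, a rod of unit length at rest in the primed frame); the function names are hypothetical.

```python
import math

def gamma(V, c=1.0):
    return 1.0 / math.sqrt(1.0 - (V / c) ** 2)

def mark_at_t0(x_primed, V, c=1.0):
    """Position x of a point fixed at x' in the primed frame, marked at t = 0.
    From (13.17) with t = 0: x' = gamma * x, so x = x' / gamma."""
    return x_primed / gamma(V, c)

def primed_time_of_mark(x, V, c=1.0):
    """Time t' of the marking event (x, t = 0), from (13.18)."""
    return gamma(V, c) * (0.0 - V * x / c**2)

V = 0.6               # assumed relative velocity in units of c
rest_length = 1.0     # assumed length of the rod in its own (primed) frame

x_rear = mark_at_t0(0.0, V)
x_front = mark_at_t0(rest_length, V)
measured = x_front - x_rear
print("measured length:", measured)

# The two marks are simultaneous for the stationary observer only:
print("t' of rear mark :", primed_time_of_mark(x_rear, V))
print("t' of front mark:", primed_time_of_mark(x_front, V))
```

The second part of the output shows why the result is not paradoxical: the two marking events, simultaneous in the unprimed frame, occur at different times t' in the frame of the rod.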

Note also that if, in an imaginary experiment, we undertook to 
photograph two fast-moving objects, no length contraction would 
be observed, because the light beams from the ends reaching the 
camera are not emitted simultaneously. An analysis reveals that 
in a snapshot the moving body would appear foreshortened as if 
viewed at an angle. Of course, making a photograph of such fast- 
moving bodies is, at present, as speculative as viewing them with 
the unaided eye. 

Addition of Velocities. We shall now find the equation for the
composition of velocities arising from the Lorentz transformations.
Differentiating (13.17) and (13.18) and dividing one by the other,
we obtain

$$\frac{dx'}{dt'} = v'_x = \frac{dx/dt - V}{1 - (V/c^2)(dx/dt)} = \frac{v_x - V}{1 - Vv_x/c^2} \tag{13.23}$$


Noting that dy' = dy and dz' = dz, we have a transformation
of the velocity components perpendicular to V:

$$\frac{dy'}{dt'} = v'_y = \frac{(dy/dt)(1 - V^2/c^2)^{1/2}}{1 - (V/c^2)(dx/dt)} = \frac{v_y (1 - V^2/c^2)^{1/2}}{1 - Vv_x/c^2}$$

$$v'_z = \frac{v_z (1 - V^2/c^2)^{1/2}}{1 - Vv_x/c^2} \tag{13.24}$$


For small velocities, (13.23) and (13.24) become the ordinary
equations for addition of velocities. This can be seen if we let c tend
to infinity, that is, by putting V/c = 0.

It is easy to see that if $v = (v_x^2 + v_y^2 + v_z^2)^{1/2} = c$, then
likewise v' = c, that is, the absolute value of the velocity of
electromagnetic perturbations does not change in passing from one
inertial frame to another. But the separate components of the velocity
of light, which are less than c, may of course change, just as the
direction of a light ray relative to different observers may vary,
since there is no absolute direction in space.
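
Both statements, the invariance of the absolute value c and the Galilean limit, can be checked directly from (13.23) and (13.24). The sketch below uses units with c = 1; the sample velocities are assumed values and the function names are hypothetical.

```python
import math

def transform_velocity(vx, vy, vz, V, c=1.0):
    """Velocity components in the primed frame, Eqs. (13.23) and (13.24)."""
    denom = 1.0 - V * vx / c**2
    root = math.sqrt(1.0 - (V / c) ** 2)
    return (vx - V) / denom, vy * root / denom, vz * root / denom

def speed(v):
    """Absolute value of a velocity given by its three components."""
    return math.sqrt(sum(comp * comp for comp in v))

V = 0.5                      # assumed relative velocity in units of c

# A light ray travelling obliquely: components 0.6c and 0.8c, |v| = c.
light = (0.6, 0.8, 0.0)
light_primed = transform_velocity(*light, V)
print("components:", light_primed)
print("absolute value:", speed(light_primed))

# Small velocities: the result approaches v_x - V, v_y, v_z.
slow = (1e-4, 2e-4, 0.0)
slow_primed = transform_velocity(*slow, 1e-4)
print("slow case:", slow_primed)
```

The components of the light ray change, so its direction differs in the two frames, while the absolute value stays equal to c.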

In this connection let us consider the phenomenon of the aberration
of light. Astronomical aberration, or the deflection of light, consists
in the fact that the stars describe small ellipses in the sky in the
course of the year. The origin of this phenomenon is easy to explain:
the velocity of the earth in its annual motion combines differently
with the velocity of the light emitted by a star (Figure 22). If the
velocity vector of the light from the star relative to the sun is ES,
then the resultant direction of the velocity for one position of the
earth is $ET_1$, and in half a year it is $ET_2$. These directions are
projected on different points of the celestial sphere, so that in the
course of the year the star describes a closed ellipse. In the
direction perpendicular to the plane of the earth’s orbit both axes of
the ellipse are equal, and we have a circle, while in the plane of the
orbit the ellipse becomes a straight line equal in length to the
diameter of that circle. In other words, the semi-major axis of the
ellipse is always equal to V/c, where V is the velocity of the earth:

$$\frac{V}{c} = \frac{1}{10\,000} = 20.63''$$

Figure 22
But why, we may ask, does the velocity of light in Michelson’s
experiment not combine with the velocity of the earth but remain
equal to c, while the phenomenon of aberration shows that these
velocities do combine? To approximate the conditions of Michelson’s
experiment more closely to those of the observed aberration, the
experiment was carried out with an extraterrestrial light source which,
of course, did not change the result. The apparent contradiction
is explained by the fact that in Michelson’s experiment it is the
absolute value of the velocity of light that is measured according to
the path difference of the beams, whereas in aberration the change in
the direction of the velocity of light is due to the changing direction
of the velocity of the earth along its orbit. If we take, for example,
a star located in a direction perpendicular to the earth’s orbit, then
in (13.23) and (13.24) we have $v_x = 0$, $v_y = c$, $v_z = 0$. The
components of the velocity of light relative to the earth are

$$v'_x = -V, \qquad v'_y = c\,(1 - V^2/c^2)^{1/2}$$

And, in accordance with Michelson’s experiment,
$v_x'^2 + v_y'^2 = c^2$. The direction of the projection of the
velocity of light on the plane of the earth’s orbit (the ecliptic) is
reversed in the course of half a year, which is why aberration occurs.
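
For a star perpendicular to the earth’s orbit, these components give the aberration angle at once. The following sketch, in units with c = 1 and with V/c = 1/10 000 as quoted above, converts the angle between the transformed ray and the y axis to seconds of arc; the variable names are hypothetical.

```python
import math

c = 1.0
V = 1.0e-4 * c     # orbital velocity of the earth, V/c = 1/10 000

# Light from a star perpendicular to the orbit: v_x = 0, v_y = c.
vx_p = -V
vy_p = c * math.sqrt(1.0 - (V / c) ** 2)

# Michelson: the absolute value of the velocity is unchanged.
speed_sq = vx_p**2 + vy_p**2

# Aberration: the direction is tilted by a small angle.
angle_rad = math.atan2(abs(vx_p), vy_p)
angle_arcsec = math.degrees(angle_rad) * 3600.0

print("v'^2 / c^2 =", speed_sq)
print(f"aberration angle = {angle_arcsec:.2f} seconds of arc")
```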

Similar equations are obtained in the more complicated case
when the light rays are not perpendicular to the plane of the ecliptic.
They coincide fully with the conventional equations for the
addition of velocities only when terms of the order $V^2/c^2$ are
neglected.

Another contradiction with Michelson’s experiment was thought
to be demonstrated in the Fizeau experiment, designed to determine
the velocity of light relative to a moving medium. The Fizeau method
was this. A beam of light was divided into two parts using a
half-silvered mirror (Figure 23). These beams were passed through tubes
with flowing water, one beam in the direction of flow and the other
in the opposite direction. For comparison, the same beams were
passed through the tubes with the water at rest. By subsequent
reflections the beams were once again combined, and they cancelled each
other when the path difference between them was equal to an odd
number of half wavelengths (that is, when they were in opposite
phase). Coherence between them was obtained due to the fact that
they both came from the same source. In still water, the path
difference was chosen so that the rays were reinforced, that is, the
path difference was equal to an integral number of wavelengths.
The path difference in flowing water was varied. Since the frequency
of the light and the tube lengths remained unchanged, the change
in path length indicated that the speed of light in flowing water
differed from that in still water.

First of all, we note that the result of the Fizeau experiment in
no way contradicts the general ideas about the relativity of motion.
A reference system fixed in flowing water is not equivalent to a
system fixed in the tube, if we are studying the propagation of light
in water.

Since the velocity of light in water is equal to c/n, where n is
the refractive index of the water, the general equation for the
addition of velocities (13.23) shows that c/n does not remain a constant
quantity when passing to another reference frame. At the same time
we cannot use the simple velocity-addition equation, because the
denominator of Eq. (13.23) differs from unity by V/(nc), that is,
a quantity of the same order as the quantity in parentheses in the
numerator, if it is represented as (c/n)(1 − Vn/c), where V is the
velocity of the water. Assuming that $V \ll c$, and expanding the
denominator in a series up to the linear term inclusive, we find
the change in the velocity of light in flowing water (see Exercise 1):

$$v' = \frac{c/n - V}{1 - V/(nc)} \approx \frac{c}{n} - V\left(1 - \frac{1}{n^2}\right)$$
This does not quite coincide with the result that could be expected
if we applied the conventional formula for the composition of
velocities. When Fizeau performed his experiment (in the
mid-nineteenth century), the result proved to be somewhat unexpected.
Relativity theory explained the appearance of the factor
$(1 - 1/n^2)$. And since Michelson measured the quantity c, and Fizeau
the quantity c/n, there is no contradiction between their experiments.
It should also be noted that Michelson’s experiment detected a
quadratic effect (or rather its absence!), while the Fizeau experiment
dealt with an effect linear with respect to V. That is why in
Michelson’s experiment the earth’s velocity of 30 km·s⁻¹, which is much
greater than the speed of water flowing through a pipe, is used.
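
The linear effect and the factor $(1 - 1/n^2)$ can be reproduced by applying (13.23) to $v_x = c/n$. The sketch below assumes the sign convention of (13.23), in which the frame velocity V is subtracted; the refractive index 1.333 and the flow velocity of 7 m/s are assumed illustrative values, and the function names are hypothetical.

```python
C = 2.998e8   # speed of light in vacuum, m/s

def light_speed_seen_from(V, n, c=C):
    """Eq. (13.23) applied to v_x = c/n, the velocity of light
    in the reference frame in which the water is at rest."""
    v = c / n
    return (v - V) / (1.0 - V * v / c**2)

def first_order(V, n, c=C):
    """Expansion to terms linear in V: c/n - V (1 - 1/n^2)."""
    return c / n - V * (1.0 - 1.0 / n**2)

n = 1.333   # assumed refractive index of water
V = 7.0     # assumed velocity of the water, m/s

exact = light_speed_seen_from(V, n)
approx = first_order(V, n)
galilean = C / n - V

# Fraction of V by which the velocity of light actually changes.
drag_factor = (C / n - exact) / V
print("factor from (13.23):", drag_factor)
print("1 - 1/n^2          :", 1.0 - 1.0 / n**2)
print("Galilean change    :", (C / n - galilean) / V)
```

The conventional formula would give a change equal to the full velocity V (a factor of unity), whereas (13.23) gives a little less than half of it for water.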

The Interval. Despite the fact that x and t are changed separately
by the Lorentz transformations, we can construct a quantity which
remains invariant (unchanged). It is easy to verify that this
property is possessed by the difference $c^2 t^2 - x^2$. Indeed,

$$c^2 t^2 = \frac{c^2 t'^2 + V^2 x'^2/c^2 + 2Vx't'}{1 - V^2/c^2}$$

$$x^2 = \frac{x'^2 + V^2 t'^2 + 2Vx't'}{1 - V^2/c^2}$$

or

$$c^2 t^2 - x^2 = c^2 t'^2 - x'^2 = s^2 \tag{13.25}$$

The quantity s is called the interval between two events: that which
occurred at the coordinate origin x = 0 at the initial time t = 0,
and another event that occurred at the point x at time t.

The word “event” may be considered in its most common everyday
sense, provided that its coordinates and time may be defined.
If the first event is not related to the origin of the coordinate system
and the initial instant, then

$$s^2 = c^2 (t_2 - t_1)^2 - (x_2 - x_1)^2 = c^2 (t'_2 - t'_1)^2 - (x'_2 - x'_1)^2 \tag{13.26}$$

Considerable importance is attached to the interval between two
infinitely close events. We shall assume that they are separated by
a length segment oriented arbitrarily rather than along the x axis.
Then an infinitesimal interval ds between the two events
is defined as follows:

$$ds^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2 = c^2 dt'^2 - dx'^2 - dy'^2 - dz'^2 = c^2 dt'^2 - dl'^2 \tag{13.27}$$

The interval thus written is not related to any definite direction 
of the relative velocity of the reference frames. 
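
The invariance stated by (13.25) is easy to confirm numerically. The sketch below uses units with c = 1 and an arbitrarily chosen event; the list of velocities and the function names are assumptions made for the illustration.

```python
import math

def lorentz(x, t, V, c=1.0):
    """Eqs. (13.17), (13.18)."""
    g = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    return g * (x - V * t), g * (t - V * x / c**2)

def interval_squared(x, t, c=1.0):
    """s^2 = c^2 t^2 - x^2 for an event paired with the origin event."""
    return (c * t) ** 2 - x ** 2

x, t = 3.0, 5.0    # an arbitrary event, in units with c = 1
velocities = (-0.9, -0.3, 0.0, 0.6, 0.99)

for V in velocities:
    xp, tp = lorentz(x, t, V)
    print(f"V = {V:5.2f}  x' = {xp:8.3f}  t' = {tp:8.3f}  "
          f"s^2 = {interval_squared(xp, tp):.6f}")

values = [interval_squared(*lorentz(x, t, V)) for V in velocities]
```

The coordinates x' and t' differ widely from frame to frame, while $s^2$ keeps the same value. For V = 0.6 (that is, V = x/t) the event occurs at x' = 0, the same place as the origin event.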

Spacelike and Timelike Intervals. The interval concept provides
an extremely vivid method for studying various possible space-time
relationships between two events. Let the spatial distance between
the points in which the events occurred be taken along the abscissa,
and let the time interval between them be plotted along the
ordinate (Figure 24).

Imagine the case of ct > l, i.e. $s^2 > 0$. If the same two events
are considered in different inertial reference frames, the time
intervals and spatial distances between them will be quite different,
but the interval $s^2 = c^2 t^2 - l^2$ remains invariant in all
reference frames. It follows that the locus of all possible spatial
distances l and time intervals ct is an equilateral hyperbola,
$s^2 = c^2 t^2 - l^2$.
One branch of the hyperbola lies in the past relative to the event
that occurred at time t = 0, x = 0, while the other is completely
in the future. It is easy to see that such a relationship is inevitable
if the events are causally related. Let two events have occurred at
the same spatial point in some reference system, the second event
being an effect of the first. To this reference frame there correspond
a point O (cause) and a point A (effect). But all the points of the
hyperbola through A lie at t > 0, so that the cause precedes the
effect in any reference frame.

We may also proceed from causally related events taking place 
at different spatial points in the initial reference frame, for example 
the firing of a shot and the hitting of a target. But in the frame fixed 
relative to the bullet both events are located on line OA (Figure 24). 
In the initial frame of reference the firing of the bullet and the
hitting of the target lie on an inclined line drawn from the origin to the
same hyperbola passing through A. Therefore in any reference frame 
the target is hit after the shot. 

If we denote the velocity of the bullet (or of any particle) by v,
then it is easy to see that v < c. Indeed, for a reference frame fixed
relative to the particle to exist we must assume that $ds^2 > 0$,
since in that frame the displacement is zero and $ds^2$ reduces to
$c^2$ times the square of the time interval. But then
$ds^2 = c^2 dt^2 - dl^2 > 0$ in any other frame, so that
$dl^2 = v^2 dt^2 < c^2 dt^2$, or v < c, as was asserted. The
velocity of light is the limiting velocity for a material particle.

The domain above the first asymptote is called the active future
with respect to the initial event. If the cause is an event at point O,
then every effect lies in that domain. Thus, the theory of relativity
does not contradict the objective nature of causality.

Other examples of spatio-temporal configurations of events can 
be offered, for which c 2 t 2 < l 2 and s 2 < 0. Such events can in no way 
be causally related. As we have seen, the speed with which matter 
is transported cannot exceed c , while to have s 2 < 0 we must put 
l 2 > c 2 t 2 . There does not exist a reference frame in which both these 
events could occur at the same spatial point. For them s 2 < 0, so 
that the interval is an imaginary quantity. 

However, the time sequence of such events is not defined: there
are reference frames in which the event arbitrarily described as the
first occurred before the second event, and reference frames in which
the second event occurred before the first. Thus relativity theory
denies the absolute nature of the simultaneity of two events the
interval between which is imaginary. There is no reference frame
in which they could have occurred at the same point in space. An
example is the pair of points O and B in Figure 24. Point B lies on a
hyperbola belonging partly to the future and partly to the past. It is
clear that O and B cannot be causally related, since otherwise the
interaction would have to propagate from O to B instantaneously. Hence
there exist reference frames in which event B occurred before O.

The domain between the asymptotes on the side of B is called 
the neutral region with respect to the initial event. 

The asymptotes themselves are of special interest: on them l = ct,
s = 0. The relationship l = ct occurs in the case of two events
connected by an electromagnetic signal, for example the dispatch and
reception of a wireless message. For two such events, s = 0 in all
reference frames, because the speed of light is invariant, and we
must always have l = ct. Since the graph in Figure 24 actually
refers not to a plane but to 3+1 dimensions (the three spatial
coordinates and time), the locus of zero intervals is described as the
light cone.
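
The three cases can be summarized in a short sketch. It works in units with c = 1; the sample events A and B are assumed values standing in for the points A and B of Figure 24 (they are not taken from the figure), and the function names are hypothetical. The code classifies the interval between the origin event O and a second event, and shows that the time order can be reversed by a change of frame only when the interval is spacelike.

```python
import math

def classify_interval(dt, dl, c=1.0, tol=1e-12):
    """Return 'timelike', 'spacelike' or 'lightlike' from s^2 = c^2 dt^2 - dl^2."""
    s2 = (c * dt) ** 2 - dl ** 2
    if abs(s2) <= tol:
        return "lightlike"
    return "timelike" if s2 > 0 else "spacelike"

def primed_time(x, t, V, c=1.0):
    """Time t' of the event (x, t), from (13.18)."""
    return (t - V * x / c**2) / math.sqrt(1.0 - (V / c) ** 2)

A = (1.0, 2.0)   # assumed event (x, t) with ct > l
B = (2.0, 1.0)   # assumed event (x, t) with ct < l

print("O-A:", classify_interval(dt=A[1], dl=A[0]))
print("O-B:", classify_interval(dt=B[1], dl=B[0]))
print("signal:", classify_interval(dt=1.0, dl=1.0))

# Time of A and B in a frame moving with V = 0.8 (the origin event has t' = 0).
print("t'(A) =", primed_time(*A, 0.8))   # still after the origin event
print("t'(B) =", primed_time(*B, 0.8))   # now before the origin event

earliest_A = min(primed_time(*A, V) for V in (-0.99, -0.5, 0.0, 0.5, 0.99))
```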

Proper Time. Closely related to the interval concept is that of the
proper time of a particle. By definition, the displacement of a particle
relative to a reference frame fixed with respect to it is zero. The
reference frame is not necessarily inertial if the particle is moving
with acceleration. Time measured in the particle’s own system is,
evidently, expressed in terms of the interval as

$$dt_0 = \frac{ds}{c} = \left(dt^2 - \frac{dl^2}{c^2}\right)^{1/2} = dt \left(1 - \frac{v^2}{c^2}\right)^{1/2} \tag{13.28}$$

Here, dt and dl represent the time interval and displacement of the
particle relative to a reference frame not associated with it. From
(13.28) or (13.21), the proper time of a particle travelling with
velocity v = dl/dt is the shortest. For finite time intervals we obtain

$$t_0 = \int dt \left(1 - \frac{v^2}{c^2}\right)^{1/2} < \int dt \tag{13.29}$$

or

$$t_0 < t \tag{13.30}$$

It may seem that (13.29) and (13.30) contradict what was said
concerning the reciprocal nature of time dilation for two observers.
Actually, to measure the difference in times yielded by these
equations the moving reference frame must be brought to rest relative
to the stationary one, that is, it must be deprived of its inertial
character. Such a noninertial frame is in no way equivalent to an
inertial one.

The foregoing can also be explained by means of the following
reasoning. Let a reference frame be accelerated during a certain
time interval $\tau$; then it travels uniformly for a very long time t;
then its direction of motion is reversed during time $2\tau$; then it
again travels uniformly for a long time t; and finally it slows down to
rest in time $\tau$. This, of course, is a description of a round trip.

According to Eq. (13.29), the time dilation occurs mainly during 
the period of uniform motion, since t x. If that period is increased, 
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say, tenfold, while the acceleration and deceleration times remain 
the same, the time dilation is also tenfold greater. It follows from 
this that the acceleration and deceleration are needed only to com¬ 
pare the passage of time in the two reference frames, the inertial 
and the noninertial, but not to effect time dilation in the noniner- 
tial frame. Noninertionality is, so to say, a tag on the system that 
makes it possible to single it out as the one in which less time was 
seen to have passed between the departure and return from the 
trip. 
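
The round-trip argument can be checked numerically. The sketch below is only an illustration under stated assumptions: the speed is taken to change linearly during each acceleration phase, all times are measured in the stationary frame, units are chosen so that c = 1, and the function names are hypothetical. It integrates Eq. (13.29) by the midpoint rule and compares the accumulated lag for two cruise times that differ tenfold.

```python
import math

def proper_time(speed, t_total, c=1.0, steps=100_000):
    """Midpoint-rule estimate of t0 = integral of sqrt(1 - v^2/c^2) dt, Eq. (13.29)."""
    dt = t_total / steps
    return sum(
        math.sqrt(1.0 - (speed((i + 0.5) * dt) / c) ** 2) * dt
        for i in range(steps)
    )

def round_trip_speed(tau, t_cruise, v_max):
    """Speed profile: ramp up (tau), cruise, reverse (2*tau), cruise, ramp down (tau).

    The linear ramps are an assumption made for this sketch only.
    """
    def speed(t):
        if t < tau:
            return v_max * t / tau
        t -= tau
        if t < t_cruise:
            return v_max
        t -= t_cruise
        if t < 2 * tau:
            return v_max * abs(1.0 - t / tau)
        t -= 2 * tau
        if t < t_cruise:
            return v_max
        t -= t_cruise
        return v_max * max(0.0, 1.0 - t / tau)
    return speed

def time_lag(tau, t_cruise, v_max):
    """Stationary-frame time minus the traveller's proper time for the whole trip."""
    t_total = 4 * tau + 2 * t_cruise
    return t_total - proper_time(round_trip_speed(tau, t_cruise, v_max), t_total)

lag_short = time_lag(tau=1.0, t_cruise=1_000.0, v_max=0.6)
lag_long = time_lag(tau=1.0, t_cruise=10_000.0, v_max=0.6)
print(lag_short, lag_long, lag_long / lag_short)
```

With the acceleration phases held fixed, the ratio of the two lags comes out close to ten, which is the scaling described in the text.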

Mathematically this can also be explained as follows: $dt$ is a total
differential, while $dt_0 = ds/c$ is not a total differential. Therefore
the value of the integral (13.29) depends on the function $v(t)$
substituted into it. There is an analogy here with the length of a
broken or curved line as against the difference between the coordinates
of its end points: the coordinate difference is a definite quantity, while
the length of a path depends upon the form of the curve.

Time $t_0$ is the reading of the clock in the moving reference frame.
It defines the rhythm of all physical (and physiological) processes 
in that frame. This can at present be directly verified experimentally 
only with regard to the decay time of elementary particles, but even 
without direct verification there is no doubt at all about the validity 
of the conclusion. 

The mean lifetime of a positive $\pi$-meson, whose mass is equal to
276 electron masses, before it decays into a $\mu$-meson, or muon, with
a mass of 206 electron masses, and a neutral particle, is $2 \times 10^{-8}$ s
(negative $\pi$-mesons are usually captured by nuclei before they have
time to decay). This time has been measured for $\pi$-mesons brought
to rest in matter, that is, it is their proper time. If the relationship
(13.30), which expresses time dilation, did not exist, a fast $\pi$-meson
would have the same lifetime relative to a stationary reference frame
fixed with respect to the earth. In that case it would be able to travel
no more than $2 \times 10^{-8} \times 3 \times 10^{10} = 600$ cm through air, because $c$
is the limiting velocity of motion. Actually the mean path of $\pi$-mesons
is much longer, because their lifetime measured in a stationary
reference frame can be much longer than in their proper reference
frame.
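
As a rough illustration of the estimate above, the following sketch compares the mean path with and without time dilation. The lifetime and the value of c are the ones quoted in the text; the chosen speeds and the function name are assumptions made for the example only. The dilated lifetime is taken from (13.28) solved for $dt$.

```python
import math

C_CM_PER_S = 3e10   # speed of light as used in the text, cm/s
TAU_0_S = 2e-8      # proper mean lifetime quoted in the text, s

def mean_path_cm(beta, dilated=True):
    """Mean path of a meson moving at speed beta*c, with or without time dilation."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2) if dilated else 1.0
    return beta * C_CM_PER_S * TAU_0_S * gamma

for beta in (0.9, 0.99, 0.999):   # illustrative speeds, not taken from the text
    print(beta, mean_path_cm(beta, dilated=False), mean_path_cm(beta))
```

Without dilation the path never reaches 600 cm; with it, the path grows without bound as the speed approaches c.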


Tensor Notation. In Section 11 it was pointed out that, besides 
possessing the same dimensionality on both sides, a correct physical 
equation must also be invariant with respect to rotations of the 
coordinate system. In other words, only such quantities may occur 
on both sides of the equation which transform similarly in passing 
to another coordinate system. For this to be apparent from the 
equation itself it is convenient to write it in vector or tensor form. 
Furthermore, every equation must satisfy the relativity principle,
that is, preserve its form in transformations to other inertial reference
frames.

As applied to Einstein’s relativity principle, it is possible to 
write equations in a way that reveals both the necessary properties 
of invariance simultaneously. Such notation is known as relativistic 
invariant notation. 

First we give a somewhat different form to the Lorentz transfor¬ 
mations, introducing the notation 

$$\frac{iV}{c} = \tan\psi \tag{13.31}$$

so that

$$\left(1 - \frac{V^2}{c^2}\right)^{1/2} = \frac{1}{\cos\psi}$$

We further introduce the imaginary coordinate $x_4$:

$$x_4 = ict \tag{13.32}$$

Then transformations (13.17) and (13.18) take the form of a rotation
of the coordinate system through the angle $\psi$:

$$x_1' = x_1\cos\psi + x_4\sin\psi \tag{13.33}$$

$$x_4' = -x_1\sin\psi + x_4\cos\psi \tag{13.34}$$

This manipulation involves no additional physical assumptions 
in comparison with those made in developing the set of transforma¬ 
tions (13.17) and (13.18). As will now be shown, the improvements 
were made only in the notation. The imaginary unit is needed to 
achieve complete formal similarity with conventional rotation of 
coordinates. 
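
That the rotation through an imaginary angle is only a change of notation can be checked with complex arithmetic. The sketch below assumes that (13.17) and (13.18) are the standard Lorentz transformations for motion along x; the function names and the sample numbers are illustrative.

```python
import cmath
import math

def lorentz_as_rotation(x, t, V, c=1.0):
    """Rotate (x1, x4) through the imaginary angle psi with tan(psi) = i V / c."""
    psi = cmath.atan(1j * V / c)                          # Eq. (13.31)
    x1, x4 = x, 1j * c * t                                # Eq. (13.32)
    x1_new = x1 * cmath.cos(psi) + x4 * cmath.sin(psi)    # Eq. (13.33)
    x4_new = -x1 * cmath.sin(psi) + x4 * cmath.cos(psi)   # Eq. (13.34)
    return x1_new.real, (x4_new / (1j * c)).real          # back to (x', t')

def lorentz_direct(x, t, V, c=1.0):
    """Standard form of the transformation, assumed to match (13.17) and (13.18)."""
    g = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    return g * (x - V * t), g * (t - V * x / c ** 2)

print(lorentz_as_rotation(2.0, 3.0, 0.6))
print(lorentz_direct(2.0, 3.0, 0.6))
```

Both calls should print the same pair of numbers up to rounding, since the sine and cosine of an imaginary angle are just hyperbolic functions in disguise.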

Any rotation of coordinates, including conventional (spatial) 
rotation, can be pictured as a set of separate rotations in which only 
two of the total number of coordinates are transformed. In particu¬ 
lar, if a fourth coordinate is introduced, then any rotation in four- 
dimensional space is effected by the Lorentz transformations in the 
form (13.33) and (13.34) and by additional spatial rotations of the 
coordinates through real angles. We did not make time an imaginary 
coordinate, but by multiplying it by an imaginary unit we have 
been able to transform the set of three spatial coordinates and time 
as a single four-dimensional manifold of Cartesian coordinates. 

It is natural here to introduce four-dimensional scalars, vectors, 
and tensors. For example, the interval immediately displays its 
scalarly invariant nature. Introducing into it $c\,dt = dx_4/i$, we obtain

$$ds^2 = c^2\,dt^2 - dx_1^2 - dx_2^2 - dx_3^2 = -dx_k\,dx_k \equiv -(dx_k)^2 \tag{13.35}$$

Here the dummy subscript $k$ takes on values from 1 to 4, as should
be in a scalar expression. Unlike the indices in three-dimensional
space, which in tensor summation assume values from 1 to 3 and
are denoted by Greek letters, we shall denote four-dimensional space 
indices by Latin letters. 

The known definition of a vector remains: it is a collection of four 
quantities which transform like coordinates. We shall encounter 
such quantities in relativistic mechanics (Section 14) and electro¬ 
dynamics (Section 15). 

All the main results referring to tensors in three-dimensional 
space are applicable to four-dimensional space, with the exception 
of one more or less “fortuitous” one, the vector product written in 
three-dimensional space as 

$$(\mathbf{A} \times \mathbf{B})_\alpha = \varepsilon_{\alpha\beta\gamma} A_\beta B_\gamma$$

Having to deal essentially with an antisymmetric tensor of rank 2,
$A_\beta B_\gamma - A_\gamma B_\beta$, we reduced it to a vector, since the number of
components of the one and the other in three-dimensional space is the
same. In four-dimensional space the antisymmetric tensor $A_{ik}$ has
six components, $A_{12}$, $A_{13}$, $A_{14}$, $A_{23}$, $A_{24}$, $A_{34}$, while a vector has
only four, and there can be no correspondence between them. That 
is why the operation of vector multiplication, wherein lies the basic 
meaning of vector analysis, does not occur in four-dimensional space. 
Here, tensor notation is doubtlessly more convenient. 
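
The component count behind this argument is easy to enumerate. In the short sketch below (the function name is hypothetical), the independent components of an antisymmetric tensor are labelled by index pairs with the first index smaller than the second.

```python
from itertools import combinations

def antisymmetric_components(n):
    """Index pairs (i, k) with i < k labelling the independent components of A_ik = -A_ki."""
    return list(combinations(range(1, n + 1), 2))

print(len(antisymmetric_components(3)), antisymmetric_components(3))
print(len(antisymmetric_components(4)), antisymmetric_components(4))
```

Only for three dimensions does the number of pairs coincide with the number of vector components, which is why the vector product is special to three-dimensional space.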

We shall prove the relativistic invariance of an equation by 
reducing it to four-dimensional tensor form. This form of notation, 
in turn, makes it possible to specify in advance the equations that 
agree with relativity theory, and thus it substantially reduces the 
number of possible physical assumptions in the search for new laws 
and regularities. 


EXERCISES 


1. Calculate the change in the velocity of light propagating through 
flowing water in the Fizeau experiment. 

Solution. 


$$u_\pm = \frac{c/n \pm V}{1 \pm V/(nc)} \approx \frac{c}{n} \pm V\left(1 - \frac{1}{n^2}\right)$$

Disregarding the theory of relativity, the result would be $u_\pm = c/n \pm V$.
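
A quick numerical comparison shows how close the first-order drag formula is to the exact velocity addition. The refractive index and flow speed used here are illustrative assumptions, not values from the text, and the function names are hypothetical.

```python
def fizeau_exact(n, V, c=3e10, sign=+1):
    """Relativistic addition of the velocities c/n and +/-V (cm/s)."""
    return (c / n + sign * V) / (1 + sign * V / (n * c))

def fizeau_first_order(n, V, c=3e10, sign=+1):
    """First-order approximation c/n +/- V (1 - 1/n^2)."""
    return c / n + sign * V * (1 - 1 / n ** 2)

n, V = 1.33, 700.0   # assumed: water, flow speed of 7 m/s
drag = fizeau_exact(n, V) - 3e10 / n
print(drag, V * (1 - 1 / n ** 2), V)   # relativistic shift, its approximation, naive shift
```

The relativistic shift is well below the naive value V, and the first-order formula reproduces it to within rounding at such flow speeds.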


2. Obtain a precise equation for the aberration of light, for an arbi¬ 
trary inclination of the ray to the ecliptic. 

Answer.

$$\cos\vartheta' = \frac{\cos\vartheta - V/c}{1 - (V/c)\cos\vartheta}$$

3. Write the equations for the Lorentz transformations for an arbitrary
direction of the velocity $\mathbf{V}$ relative to a coordinate system.

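Solution. In our equations $x = \mathbf{r}\cdot\mathbf{V}/V$ and $x' = \mathbf{r}'\cdot\mathbf{V}/V$. The radius-vector
component perpendicular to the velocity is

$$\mathbf{r} - \frac{\mathbf{V}(\mathbf{r}\cdot\mathbf{V})}{V^2} = \mathbf{r}' - \frac{\mathbf{V}(\mathbf{r}'\cdot\mathbf{V})}{V^2}$$

From (13.17)

$$\frac{\mathbf{r}'\cdot\mathbf{V}}{V} = \frac{\mathbf{r}\cdot\mathbf{V}/V - Vt}{(1 - V^2/c^2)^{1/2}}$$

Multiplying this equation by $\mathbf{V}/V$ and adding the equations, we obtain

$$\mathbf{r}' = \mathbf{r} - \frac{\mathbf{V}(\mathbf{r}\cdot\mathbf{V})}{V^2} + \left(\frac{\mathbf{V}(\mathbf{r}\cdot\mathbf{V})}{V^2} - \mathbf{V}t\right)\left(1 - \frac{V^2}{c^2}\right)^{-1/2}$$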


4. Show that a four-dimensional element of volume, $dx_1\,dx_2\,dx_3\,dx_4$,
is invariant under the Lorentz transformations.

Solution. Represent the volume element in tensor notation:

$$d^{(4)}\tau = \varepsilon_{iklm}\,dx_i^{(1)}\,dx_k^{(2)}\,dx_l^{(3)}\,dx_m^{(4)}$$

where only the $i$th component of the vector $dx^{(i)}$ is other than zero.

We may also write the array of transformation coefficients corresponding
to the transformation from the unprimed to the primed reference frame
according to (13.17) and (13.18):

$$\begin{pmatrix}
\dfrac{1}{\sqrt{1 - V^2/c^2}} & 0 & 0 & \dfrac{-V/c^2}{\sqrt{1 - V^2/c^2}} \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
\dfrac{-V}{\sqrt{1 - V^2/c^2}} & 0 & 0 & \dfrac{1}{\sqrt{1 - V^2/c^2}}
\end{pmatrix}$$

The Jacobian of this transformation is unity. A transformation of more
general form is associated with subsequent rotations in three-dimensional
space which do not affect the corresponding three-dimensional volume element
$d^{(3)}\tau$. Hence, $d^{(4)}\tau$ is always invariant.

5. Find how the Lorentz transformations alter a three-dimensional 
volume element. 

Solution. The Jacobian comprising the first three rows and columns of
the transformation array is equal to $(1 - V^2/c^2)^{-1/2}$, whence

$$d^{(3)}\tau' = d^{(3)}\tau\,\frac{1}{(1 - V^2/c^2)^{1/2}}$$

This can also be demonstrated by applying the length contraction formula
(13.22) to the volume element in the direction parallel to the velocity. If $dt$
from (13.21) is substituted, the product $d^{(3)}\tau\,dt$ again yields an invariant.
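
The two Jacobians of Exercises 4 and 5 can be verified numerically. The sketch below assumes the coefficient array of a Lorentz transformation along x in the variables (x, y, z, t), with units in which c = 1 by default; the helper names are hypothetical.

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for a 4x4 array)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )

def lorentz_array(V, c=1.0):
    """Coefficient array for the variables (x, y, z, t), motion along x."""
    g = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    return [
        [g,      0.0, 0.0, -g * V / c ** 2],
        [0.0,    1.0, 0.0, 0.0],
        [0.0,    0.0, 1.0, 0.0],
        [-g * V, 0.0, 0.0, g],
    ]

a = lorentz_array(0.6)
spatial_block = [row[:3] for row in a[:3]]
print(det(a), det(spatial_block))
```

The full determinant should be unity, while the spatial block alone gives the factor $(1 - V^2/c^2)^{-1/2}$.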

RELATIVISTIC MECHANICS


The concept of relativistic invariance will enable us to obtain the 
expression for the action of a free mass point. Much of what was said 
concerning the action function in Newtonian mechanics remains 
valid in relativity theory; specifically action must not involve the 
coordinates and time explicitly under the integral sign, and it cannot 
be dependent upon the direction of the particle’s velocity. It must 
also satisfy the requirement of the relativity principle that its 
form must not change under transformations to other inertial frames. 
It was pointed out in Section 13 that when the equations are 
written in a relativistic invariant form, the invariance condition 
relative to spatial rotations is satisfied together with Einstein’s 
relativity principle. 

Another necessary requirement which action must satisfy in rela¬ 
tivity theory is that, at small velocities, all expressions become the 
corresponding expressions of Newtonian mechanics. 


The Lagrangian of a Free Particle. In Section 13 we obtained an 
infinitesimal of the first order with respect to the increment of all 
spatial variables of a particle and to the time increment. This is 
the infinitesimal interval ds, which also satisfies the requirement of 
relativistic invariance. No other such quantity can be developed. 
We shall therefore look for the action of a free particle in the form 

$$S = \alpha \int ds \tag{14.1}$$

We now pass from the action to the Lagrangian. For this we represent
the infinitesimal interval as follows:

$$ds = (c^2\,dt^2 - dl^2)^{1/2} = c\,dt\left[1 - \frac{1}{c^2}\left(\frac{dl}{dt}\right)^2\right]^{1/2} = c\,dt\left(1 - \frac{v^2}{c^2}\right)^{1/2} \tag{14.2}$$

Hence, the Lagrangian, which is identically determined by $S = \int L\,dt$, is

$$L = \alpha c\left(1 - \frac{v^2}{c^2}\right)^{1/2} \tag{14.3}$$


We determine the coefficient $\alpha$ from the condition that, at small
velocities $v$, the function $L$ turns into the nonrelativistic Lagrangian
of a free particle. Since

$$\left(1 - \frac{v^2}{c^2}\right)^{1/2} \approx 1 - \frac{v^2}{2c^2}$$

for $v \ll c$ we rewrite $L$ in the form

$$L = \alpha c\left(1 - \frac{v^2}{c^2}\right)^{1/2} \approx \alpha c - \frac{\alpha v^2}{2c}$$

Since the first term is constant, it can be omitted, while the second
term should reduce to $mv^2/2$ (see Sec. 2). Comparing with (2.26),
we obtain

$$\alpha = -mc \tag{14.4}$$

By its meaning, the mass $m$ of the particle is here defined
in its proper reference frame. In future this is the only definition
of mass we shall use. Since the proper reference frame is specified
uniquely, the quantity $m$ is relativistically invariant and characterizes
the particle. Finally, we have the Lagrangian in the form

$$L = -mc^2\left(1 - \frac{v^2}{c^2}\right)^{1/2} \tag{14.5}$$
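
The low-velocity limit that fixed the coefficient can be confirmed numerically. In this sketch (units with m = c = 1 and the sample velocities are arbitrary choices), the Lagrangian (14.5) with the constant $-mc^2$ removed is compared with $mv^2/2$.

```python
import math

def lagrangian(m, v, c):
    """Free-particle Lagrangian, Eq. (14.5)."""
    return -m * c ** 2 * math.sqrt(1.0 - (v / c) ** 2)

m, c = 1.0, 1.0
for v in (1e-1, 1e-2, 1e-3):
    # Left: L with the constant term -m c^2 removed; right: Newtonian value.
    print(v, lagrangian(m, v, c) + m * c ** 2, 0.5 * m * v ** 2)
```

The two columns agree better and better as v decreases, the relative difference being of the order of $v^2/c^2$.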


Momentum in Relativistic Mechanics. The expression for momen¬ 
tum in relativity theory is obtained directly from (14.5): 


$$\mathbf{p} = \frac{\partial L}{\partial \mathbf{v}} = \frac{m\mathbf{v}}{(1 - v^2/c^2)^{1/2}} \tag{14.6}$$

At small velocities it reduces, as it should, to the expression for
momentum in Newtonian mechanics, $\mathbf{p} = m\mathbf{v}$.

In some books the quantity $m(1 - v^2/c^2)^{-1/2}$, that is, the
proportionality factor between velocity and momentum, is called the
“mass of motion”, as distinct from the rest mass, $m$. To avoid confusion,
we shall not use the term “mass of motion”, and will always
take the term “mass” to mean the relativistically invariant quantity $m$.

The limiting nature of the velocity of light, mentioned before 
in Section 13, can be seen from Eq. (14.6). As a particle’s velocity 
tends to the speed of light its momentum tends to infinity. 

The only possible exception is a particle of mass zero. The momen¬ 
tum of such a particle written in the form (14.6) yields, for v = c, 
an indeterminacy of the form 0/0 and may remain finite. But then 
the velocity of such a particle must be equal to c, since otherwise 
its momentum vanishes identically, making it incapable of inter¬ 
acting with any mechanical system, that is, it would in no way 
display its physical existence. 

As we know, the velocity c is relativistically invariant, so that 
the property of a given particle to travel with the speed of light 
is inherent in it, not in the reference frame in which its motion is 
described. The momentum of such a particle must be stated not 
according to Eq. (14.6) but independently of the magnitude of its 
velocity, which is always the same and equal to c. 
Velocities greater than c are physically meaningless, since they
would be associated with imaginary values of the momentum. Par¬ 
ticles travelling faster than light would be moving faster than inter¬ 
actions transmitted between them. The absurdity of such a situa¬ 
tion can be readily appreciated: the causality principle would break 
down. The impossibility of velocities exceeding the speed of light 
stems from the fact that the relativity principle does not violate 
causality, which links objective events. And in Section 13 it was 
already pointed out that such a concept as the order of cause and 
effect does not depend on the choice of reference frame. 


Energy in the Theory of Relativity. We proceed from the general 
definition of energy (4.1): 

$$E = \mathbf{v}\cdot\frac{\partial L}{\partial \mathbf{v}} - L = \frac{mv^2}{(1 - v^2/c^2)^{1/2}} + mc^2\left(1 - \frac{v^2}{c^2}\right)^{1/2} = \frac{mc^2}{(1 - v^2/c^2)^{1/2}} \tag{14.7}$$


This equation reaffirms the limiting nature of the velocity of light. 
When $v \to c$, the energy of a particle tends to infinity. In other words,
infinite work must be done to accelerate a particle to the speed of
light. The exception is a particle whose mass is zero: at $v = c$
it has finite momentum and finite energy.
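
The growth of momentum and energy near the speed of light can be tabulated directly from (14.6) and (14.7). The sketch uses units with m = c = 1 and arbitrary sample velocities; the function names are hypothetical.

```python
import math

def gamma(v, c=1.0):
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def momentum(m, v, c=1.0):
    """Magnitude of the momentum, Eq. (14.6)."""
    return m * v * gamma(v, c)

def energy(m, v, c=1.0):
    """Energy of a free particle, Eq. (14.7)."""
    return m * c ** 2 * gamma(v, c)

for v in (1e-4, 0.5, 0.9, 0.99, 0.9999, 0.999999):
    print(v, momentum(1.0, v), energy(1.0, v))
```

At the smallest velocity the momentum is practically mv and the energy practically the rest energy; both grow without limit as v approaches c.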

From Eq. (14.7), the energy of a particle at rest is equal to $mc^2$.
Let us apply this formula to a compound particle capable of spon¬ 
taneously decaying into two new particles, for example, a nucleus 
decaying into a daughter nucleus and an alpha-particle. Since the 
decay is spontaneous, it is due not to an external action on the parent 
nucleus but to certain specific features of its internal motion. Con¬ 
sequently, radioactive decay is a process that takes place in a closed 
system, and the total energy is therefore conserved. The energy
of the parent particle prior to disintegration is equal to the sum
of the energies of the product nucleus and the $\alpha$-particle after
disintegration, when they no longer interact.

The energy of each particle is expressed according to equation (14.7), 
which is applicable to any particle (simple or compound) whose 
motion is considered as a whole. The only possible form of the La- 
grangian for such motion is (14.5), whence it follows that the energy 
is represented in the form (14.7). Assuming now that the decaying 
particle was at rest, we write the expression for the energy-conser¬ 
vation law in the decay process, using (14.7): 


$$mc^2 = \frac{m_1 c^2}{(1 - v_1^2/c^2)^{1/2}} + \frac{m_2 c^2}{(1 - v_2^2/c^2)^{1/2}} \tag{14.8}$$


Both terms in the right-hand side, $E_1$ and $E_2$, are respectively
greater than $m_1 c^2$ and $m_2 c^2$, whence we obtain the important
inequality

$$m > m_1 + m_2 \tag{14.9}$$


Thus, the mass of a compound particle capable of spontaneous 
decay is greater than the sum of the masses of the particles created 
as a result of the disintegration. This is an essentially new fact of 
relativistic mechanics as compared with Newtonian mechanics, 
where the law of additivity of mass holds. If we define the difference

$$T = \frac{mc^2}{(1 - v^2/c^2)^{1/2}} - mc^2 \tag{14.10}$$

as the kinetic energy of a particle (for small velocities it reduces
to $mv^2/2$) and call $mc^2$ the rest energy, then it can be seen from the law
of conservation of energy (14.8) that part of the rest energy of a compound
particle is converted into the kinetic energy of the component
particles, and part is converted into their rest energy. Only the total
energies $E$, and not the kinetic energies $T$, satisfy the conservation
law, because the kinetic energy of the parent particle as a whole
is equal to zero before disintegration and cannot be equal to the
essentially positive kinetic energy of the decay products.
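
The energy bookkeeping of a decay at rest can be illustrated as follows. This is a sketch under assumptions not stated in the passage: the two products are taken to fly apart with equal and opposite momenta (momentum conservation), the masses and the speed of the lighter product are made-up numbers in units with c = 1, and the helper names are hypothetical.

```python
import math

def gamma(v, c=1.0):
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def speed_from_momentum(m, p, c=1.0):
    """Invert Eq. (14.6): v = p / sqrt(m^2 + p^2/c^2)."""
    return p / math.sqrt(m ** 2 + (p / c) ** 2)

def decay_at_rest(m1, v1, m2, c=1.0):
    """Return parent mass, speed of product 2, and the two kinetic energies."""
    p = m1 * v1 * gamma(v1, c)                 # momentum of product 1, Eq. (14.6)
    v2 = speed_from_momentum(m2, p, c)         # product 2 carries the opposite momentum
    m = m1 * gamma(v1, c) + m2 * gamma(v2, c)  # Eq. (14.8) divided by c^2
    t1 = m1 * c ** 2 * (gamma(v1, c) - 1.0)    # Eq. (14.10)
    t2 = m2 * c ** 2 * (gamma(v2, c) - 1.0)
    return m, v2, t1, t2

m, v2, t1, t2 = decay_at_rest(m1=4.0, v1=0.05, m2=230.0)
print(m, m - (4.0 + 230.0), t1 + t2)
```

The parent mass exceeds the sum of the product masses, and the excess rest energy equals the total kinetic energy of the products.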

In chemical reactions, the change in the rest masses of the reacting
substances is of the order of $10^{-9}$ (or less) of the total mass.
In nuclear reactions, where the particle velocities are of the order $c/10$,
the change in mass may approach one-half of one per cent. When
an electron and positron (a positive electron) are annihilated, their 
energy, including rest energy, is totally converted into the energy 
of electromagnetic radiation. 

From the quantum theory (see Part III) it is known that radiation 
propagates through space in the form of separate particles, called 
light quanta. This is not only compatible with the wave properties 
of radiation but derives directly from them. The velocity of a light 
quantum is $c$, so that its mass is identically equal to zero. The total
rest energy of the particles taking part in an annihilation process is
equal to $2mc^2$ before annihilation, and to zero afterwards. The increase
in the energy of the electromagnetic field is, of course, not less
than $2mc^2$. The least value ($2mc^2$) is obtained when an electron and
positron annihilate at rest and have no additional kinetic energy.
We could call the energy of an electromagnetic field divided by $c^2$
its “mass”. With such a definition of mass, the total “mass” would 
be conserved. But such a “mass” conservation contains nothing new 
in comparison with the law of conservation of energy. Dividing the 
equation expressing the latter by c 2 yields no essentially new law; 
all that occurs is a transformation to other measurement units. 

It is precisely the rest mass that is best used in determining the 
energy balance in nuclear reactions, for a change in the rest mass 
of all the particles involved in the transmutation determines the 
energy which may be generated as a result of the reaction in the form
of the kinetic energy of the disintegration products or radiation 
energy. There is no sense in calling the energy of a light quantum 
divided by the square of the velocity of light its mass, because this 
quantity does not in any way characterize light quanta. Energy 
has one value in one reference system and an entirely different value 
in another, whereas mass is a quantity that characterizes a particle. 
For example, the mass of an electron (the rest mass) is equal to
$9 \times 10^{-28}$ g, while the corresponding quantity for a quantum is
identically equal to zero. But this zero characterizes a light quantum
to no less an extent than $9 \times 10^{-28}$ g characterizes an electron.

The mass of a particle determines the relationship between mo¬ 
mentum and velocity according to Eq. (14.6). It is impossible to 
calculate a particle’s mass from its momentum alone, because par¬ 
ticles with the same momenta may have quite different masses. 
That is why the occasional assertion that the momentum of the
electromagnetic field, which is manifested in the form of light
pressure, proves that light quanta possess mass is
quite meaningless. Mass cannot be determined from momentum or
energy separately: it is involved only in the relationship between 
the two quantities for the given particle and thereby characterizes 
that particle. 

Also erroneous is the widespread assertion that a mass of one
gram is capable of evolving $9 \times 10^{20}$ ergs of energy. For that,
one-half of the mass would have to be antimatter and annihilate together
with the matter. Changes in mass in pure matter (or pure antimatter) 
occur in nuclear reactions, where the total number of protons and 
neutrons does not change. Therefore, the change in mass amounts 
to no more than fractions of a percentage point. 
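
The arithmetic behind these figures is brief. In the sketch below the value of c is the rounded one used in the text, the fractional mass change of one-half of one per cent is the upper figure quoted earlier for nuclear reactions, and the function names are hypothetical.

```python
C_CM_PER_S = 3e10   # rounded speed of light, cm/s

def rest_energy_erg(mass_g):
    """Rest energy m c^2 in ergs for a mass given in grams."""
    return mass_g * C_CM_PER_S ** 2

def released_energy_erg(mass_g, fractional_mass_change):
    """Energy set free when only a fraction of the rest mass disappears."""
    return fractional_mass_change * rest_energy_erg(mass_g)

print(rest_energy_erg(1.0))              # full rest energy of one gram
print(released_energy_erg(1.0, 0.005))   # at most about half a per cent of it
```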

We shall now express energy in terms of momentum. Squaring 
Eq. (14.7) and subtracting from it Eq. (14.6), after the latter has 
also been squared and multiplied by c 2 , we obtain 

$$E^2 - c^2p^2 = m^2c^4 \tag{14.11}$$

In Section 10 we called the energy expressed in terms of
momentum the Hamiltonian. Hence

$$E = \mathcal{H} = (m^2c^4 + c^2p^2)^{1/2} \tag{14.12}$$

From this we obtain the relationship between the energy and momentum
of a particle that has no mass:

$$E = cp \tag{14.13}$$

This is the form to which expression (14.12) tends when the momentum
tends to infinity.
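
A short numerical check of (14.12) and its limits may be helpful. The sketch uses units with c = 1 and made-up sample values; the function name is hypothetical.

```python
import math

def energy_from_momentum(m, p, c=1.0):
    """Hamiltonian of a free particle, Eq. (14.12)."""
    return math.sqrt(m ** 2 * c ** 4 + c ** 2 * p ** 2)

# A particle with m = 1 moving at v = 0.6 has p = 0.75 and E = 1.25 by (14.6), (14.7).
print(energy_from_momentum(1.0, 0.75))

# A massless particle obeys E = c p exactly, Eq. (14.13).
print(energy_from_momentum(0.0, 2.0))

# For a massive particle E / (c p) approaches 1 as the momentum grows.
for p in (1.0, 1e2, 1e4, 1e6):
    print(p, energy_from_momentum(1.0, p) / p)
```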

The Lorentz Transformations for Momentum and Energy. To keep 
track of the relativistic invariance of equations it is convenient to 
write them in four-dimensional tensor notation. Let us first represent
momentum and energy in such notation. 

We substitute $dt/ds = c^{-1}(1 - v^2/c^2)^{-1/2}$ into the definition of
momentum (14.6). Its components then take the form

$$p_\alpha = mc\,\frac{dx_\alpha}{ds} \tag{14.14a}$$

where the Greek index a assumes values from 1 to 3, as agreed. 

In the definition of energy (14.7), velocity must be written as
$dl/dt$, and $dt$ replaced by $dx_4/(ic)$. Comparison with (14.14a) then
shows that the imaginary quantity $iE/c$ should be treated as the
fourth component of the momentum vector:

$$p_4 = mc\,\frac{dx_4}{ds} \tag{14.14b}$$

Together, the momentum and energy of a particle form a single four-vector
with components

$$p_i = mc\,\frac{dx_i}{ds} \tag{14.15}$$


But it was shown in Section 13 that from the mathematical point 
of view a Lorentz transformation represents a rotation of a reference 
frame through an imaginary angle the tangent of which is equal 
to iV/c. By definition, every vector transforms as a radius vector. 
Hence, in passing to another inertial reference frame the components 
of a four-momentum must transform according to Eqs. (13.33) and 
(13.34): 

$$p_1' = p_1\cos\psi + p_4\sin\psi$$

$$p_4' = -p_1\sin\psi + p_4\cos\psi \tag{14.16}$$

In order to return to conventional three-dimensional notation we
substitute $\tan\psi = iV/c$ and $p_4 = iE/c$. We then find the required
momentum and energy transformation formulas:

$$p_x' = \frac{p_x - EV/c^2}{(1 - V^2/c^2)^{1/2}} \tag{14.17}$$

$$E' = \frac{E - Vp_x}{(1 - V^2/c^2)^{1/2}} \tag{14.18}$$


If the relative velocity of the reference frames is directed along
the $x$ axis, $p_y$ and $p_z$ do not change.

Note that a correct limiting process from (14.17) to the nonrelativistic
momentum transformation formula $p_x' = p_x - mV$ is obtained
only if in place of $E$ we substitute the rest energy $mc^2$. The
nonrelativistic equation corresponds to a simple addition of velocities:
$v_x' = v_x - V$.
Hence, if we demand that the Lorentz transformations yield the
correct limiting transition to the Galilean transformations, it is 
necessary to include the rest energy of the particles in their total 
energy. Conversely, the kinetic energy T from (14.10) does not give 
a correct limiting transition. 
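
The transformation formulas (14.17) and (14.18) can be tried out numerically. The sketch below works in one spatial dimension with c = 1 and arbitrary sample values; the function names are hypothetical. It checks that $E^2 - c^2p^2$ is unchanged by the transformation and that, at small velocities, the momentum transforms approximately as $p_x - mV$.

```python
import math

def gamma(v, c=1.0):
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def momentum_energy(m, v, c=1.0):
    """Momentum and energy of a particle from Eqs. (14.6) and (14.7)."""
    g = gamma(v, c)
    return m * v * g, m * c ** 2 * g

def boost(p_x, E, V, c=1.0):
    """Transformation of momentum and energy, Eqs. (14.17) and (14.18)."""
    g = gamma(V, c)
    return g * (p_x - E * V / c ** 2), g * (E - V * p_x)

p, E = momentum_energy(1.0, 0.6)
p_new, E_new = boost(p, E, -0.8)
print(E ** 2 - p ** 2, E_new ** 2 - p_new ** 2)   # both equal m^2 c^4

# Slow particle, slow frame: the exact result is close to p_x - m V.
p_slow, E_slow = momentum_energy(1.0, 1e-3)
p_slow_new, _ = boost(p_slow, E_slow, 2e-3)
print(p_slow_new, p_slow - 1.0 * 2e-3)
```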

In relativistically invariant form, the relationship (14.12) between 
energy and momentum is written as follows: 

$$p_i p_i \equiv p_i^2 = -m^2c^2 \tag{14.19}$$

It is based on the fact that the square of the four-dimensional
velocity $u_i = dx_i/ds$ is equal to $-1$:

$$u_i^2 = \frac{dx_i\,dx_i}{ds^2} = -1$$

The Velocity of a System of Noninteracting Particles in Relativity 
Theory. We shall now show how to determine the velocity of a system 
of noninteracting particles in the theory of relativity. The difference 
from Newtonian mechanics is that the description of an interaction 
requires the inclusion of the field in the system, and the particles 
can no longer be treated by themselves as a closed system, even if 
there are no external fields acting on them. 

For simplicity’s sake we consider two particles. Between the 
velocity, energy, and momentum of each particle there exists the 
relation 

$$\mathbf{p} = \frac{E\mathbf{v}}{c^2} \tag{14.20}$$

which is derived from (14.6) and (14.7). The same equation can also
be obtained somewhat differently. From (14.17) we determine the
velocity of the reference frame in which the particle’s momentum
is zero. Putting $p_x' = 0$ in the left-hand side of (14.17), we have
in the right-hand side $V = p_x c^2/E$, or, if the velocity is not directed
along the $x$ axis, in general $\mathbf{V} = \mathbf{p}c^2/E = \mathbf{v}$, in agreement with (14.20).
Applied to one particle, the equality V = v is trivial and denotes 
simply that the momentum of a particle relative to a reference frame 
moving with the same velocity is zero. 

We now apply (14.20) to the two particles so as to find the veloc¬ 
ity of the reference frame relative to which their total momentum 
is zero. Since the particles do not interact, their total momentum 
and total energy are simply added; otherwise we would have to take 
into account the momentum and energy of their interaction field 
(in the case of nuclear forces we simply do not know how to do this).

And so, the total momentum of the particles is $\mathbf{p}_1 + \mathbf{p}_2 = \mathbf{p}$,
and their total energy is $E_1 + E_2 = E$. We direct the $x$ axis along $\mathbf{p}$.
Since the Lorentz transformations are linear and homogeneous, the
transformation formulas to another inertial frame for a sum of two
four-vectors look the same as for each of them separately. Hence,
the required velocity of the reference frame in which the total momentum
is zero is given as

$$\mathbf{V} = \frac{c^2(\mathbf{p}_1 + \mathbf{p}_2)}{E_1 + E_2} \tag{14.21}$$

To obtain the limiting transition to Newtonian mechanics from
this we must put $\mathbf{p}_1 = m_1\mathbf{v}_1$, $\mathbf{p}_2 = m_2\mathbf{v}_2$, $E_1 = m_1c^2$, $E_2 = m_2c^2$.
Then Eq. (14.21) coincides with the conventional expression for
the velocity of the centre of mass of a system of particles. The quantity
$\mathbf{V}$, expressed in terms of the particles’ velocities according
to (14.21), does not have the form of a total time derivative of any
coordinate, hence it is impossible to determine the coordinates of
the centre of mass from its velocity.

In relativistic mechanics the relative velocity $\mathbf{v}_1 - \mathbf{v}_2$ is meaningless,
because there is no simple law of velocity addition.
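
The velocity of the zero-momentum frame of two noninteracting particles can be computed and checked in a few lines. The sketch is one-dimensional, uses c = 1, and the masses and velocities are made-up sample values; the function names are hypothetical.

```python
import math

def gamma(v, c=1.0):
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def momentum_energy(m, v, c=1.0):
    g = gamma(v, c)
    return m * v * g, m * c ** 2 * g

def boost(p_x, E, V, c=1.0):
    """Momentum and energy in a frame moving with velocity V along x."""
    g = gamma(V, c)
    return g * (p_x - E * V / c ** 2), g * (E - V * p_x)

def zero_momentum_frame_velocity(p1, E1, p2, E2, c=1.0):
    """V = c^2 (p1 + p2) / (E1 + E2) for motion along one axis."""
    return c ** 2 * (p1 + p2) / (E1 + E2)

p1, E1 = momentum_energy(1.0, 0.6)
p2, E2 = momentum_energy(2.0, -0.3)
V = zero_momentum_frame_velocity(p1, E1, p2, E2)
p_total_new, E_total_new = boost(p1 + p2, E1 + E2, V)
print(V, p_total_new)

# Newtonian limit: for slow particles V tends to the centre-of-mass velocity.
q1, F1 = momentum_energy(1.0, 1e-4)
q2, F2 = momentum_energy(2.0, -3e-4)
V_slow = zero_momentum_frame_velocity(q1, F1, q2, F2)
print(V_slow, (1.0 * 1e-4 + 2.0 * (-3e-4)) / 3.0)
```

Because the transformation is linear, boosting the summed momentum and energy is equivalent to boosting each particle separately and adding the results.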

Action for a Charged Particle in an External Electromagnetic Field. 
Let us now consider the equations of motion of a charged particle 
in an external electromagnetic field. As in the case of a free particle, 
we shall proceed from the expression for action, requiring that it 
be relativistically invariant. 

Of course, one could write many possible expressions that would 
be relativistically invariant in form. We find, however, that one 
of the simplest expressions is in agreement with experience. By 
“experience” we mean a totality of facts at least as great as the
foundation on which Newtonian mechanics rests. 

The action of a particle in an electromagnetic field includes the 
action of a free particle and a supplementary term describing the in¬ 
teraction of the electromagnetic field and the charge; in relativ¬ 
istically invariant notation it looks like this: 

$$S = \int\left(-mc\,ds + \frac{e}{c}A_k\,dx_k\right) \tag{14.22}$$

where $A_k$ is meant to denote a four-vector. Its three spatial components
yield the known vector potential of an electromagnetic field
derived in Section 12. (This will be shown in the following section
in developing Maxwell’s equations from the relativistically invariant
action of an electromagnetic field.) The fourth component, $A_4$, is $i\varphi$,
where $\varphi$ is the scalar potential also developed in Section 12. The
constant e is called the charge of the particle. By definition it is 
a relativistically invariant quantity. 

We now take dt outside the parentheses in the expression for 
action. Then, by definition, in the parentheses we have the Lagran- 
gian: 

$$S = \int\left[-mc^2\left(1 - \frac{v^2}{c^2}\right)^{1/2} + \frac{e}{c}\,\mathbf{A}\cdot\mathbf{v} - e\varphi\right]dt = \int L\,dt \tag{14.23}$$
From it we obtain, according to the conventional rules, the expressions
for momentum and energy. For momentum we have

$$\mathbf{p} = \frac{\partial L}{\partial \mathbf{v}} = \frac{m\mathbf{v}}{(1 - v^2/c^2)^{1/2}} + \frac{e}{c}\mathbf{A} = \mathbf{p}_0 + \frac{e}{c}\mathbf{A} \tag{14.24}$$

Here, $\mathbf{p}_0$ denotes momentum in the absence of a field.

From (4.1), the energy is

$$E = \mathbf{v}\cdot\frac{\partial L}{\partial \mathbf{v}} - L = \mathbf{v}\cdot\mathbf{p}_0 + \frac{e}{c}\,\mathbf{v}\cdot\mathbf{A} + mc^2\left(1 - \frac{v^2}{c^2}\right)^{1/2} - \frac{e}{c}\,\mathbf{v}\cdot\mathbf{A} + e\varphi = E_0 + e\varphi \tag{14.25}$$

where $E_0$ is the energy in the absence of an external field; according
to (14.7), it is equal to

$$E_0 = \mathbf{v}\cdot\mathbf{p}_0 + mc^2\left(1 - \frac{v^2}{c^2}\right)^{1/2} = \frac{mc^2}{(1 - v^2/c^2)^{1/2}}$$

Thus, the term linear with respect to the velocity does not appear
in the energy expressed in terms of the velocity. It will be seen here
that the Lagrangian is not of the form $T - U$ because it involves
a linear term.

From (14.24) and (14.25) we obtain

$$\mathbf{p}_0 = \mathbf{p} - \frac{e}{c}\mathbf{A}, \qquad E_0 = E - e\varphi$$

But we already know the expression for $E_0$ in terms of $\mathbf{p}_0$ from
Eq. (14.12), which relates the energy and momentum of a free
particle. Substituting these quantities, expressed in terms of $\mathbf{p}$ and $E$,
into it, we obtain the Hamiltonian of a charge in an external
electromagnetic field:

$$\mathcal{H} = \left[m^2c^4 + c^2\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)^2\right]^{1/2} + e\varphi \tag{14.26}$$

In this case we can also write by analogy with the four-dimensional
notation (14.19):

$$\left(p_i - \frac{e}{c}A_i\right)\left(p_i - \frac{e}{c}A_i\right) \equiv \left(p_i - \frac{e}{c}A_i\right)^2 = -m^2c^2 \tag{14.27}$$


Equations of Motion of a Charge in an External Field. Knowing 
the Lagrangian from (14.23), we write the Lagrange equations for 
a particle moving in an external electromagnetic field. As always, 
in the most general form they must be 

$$\frac{d}{dt}\frac{\partial L}{\partial \mathbf{v}} - \frac{\partial L}{\partial \mathbf{r}} = 0$$

where one vector equation replaces three equations expressed in 
terms of the components. 
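The derivative $\partial L/\partial\mathbf{v}$ is equal to $\mathbf{p} = \mathbf{p}_0 + (e/c)\mathbf{A}$, so that the
total time derivative of $\mathbf{p}$ is

$$\frac{d}{dt}\frac{\partial L}{\partial \mathbf{v}} = \frac{d\mathbf{p}}{dt} = \frac{d\mathbf{p}_0}{dt} + \frac{e}{c}\frac{d\mathbf{A}}{dt}$$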


In order to expand the expression dA/dt, we first write it for one 
component: 


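$$\frac{dA_x}{dt} = \frac{\partial A_x}{\partial t} + \frac{\partial A_x}{\partial x}\frac{dx}{dt} + \frac{\partial A_x}{\partial y}\frac{dy}{dt} + \frac{\partial A_x}{\partial z}\frac{dz}{dt} = \frac{\partial A_x}{\partial t} + (\mathbf{v}\cdot\nabla)A_x$$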


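From this, using the notation $(\mathbf{v}\cdot\nabla)$ (cf. (11.31)) for all three
components of $\mathbf{A}$, we rewrite $d\mathbf{p}/dt$ as follows:

$$\frac{d\mathbf{p}}{dt} = \frac{d\mathbf{p}_0}{dt} + \frac{e}{c}\left(\frac{\partial \mathbf{A}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{A}\right)$$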


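We now compute the derivative $\partial L/\partial\mathbf{r}$, or, what is the same,
grad $L$:

$$\operatorname{grad} L = \frac{e}{c}\nabla(\mathbf{A}\cdot\mathbf{v}) - e\operatorname{grad}\varphi$$

The gradient $\nabla(\mathbf{A}\cdot\mathbf{v})$ denotes differentiation with respect to
coordinates, on which only $\mathbf{A}$ depends explicitly, but not $\mathbf{v}$. Therefore,
applying Eq. (11.32), we reduce grad $L$ to the form

$$\frac{\partial L}{\partial \mathbf{r}} = \operatorname{grad} L = \frac{e}{c}(\mathbf{v}\cdot\nabla)\mathbf{A} + \frac{e}{c}(\mathbf{v}\times\operatorname{curl}\mathbf{A}) - e\operatorname{grad}\varphi$$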


Substituting dp/dt and grad L into the general Lagrange equation 
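and retaining only $d\mathbf{p}_0/dt$ in the left-hand side, we obtain the required
equation of motion of a charged particle in an electromagnetic
field:

$$\frac{d\mathbf{p}_0}{dt} = \frac{d}{dt}\frac{m\mathbf{v}}{(1 - v^2/c^2)^{1/2}} = e\left(-\operatorname{grad}\varphi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}\right) + \frac{e}{c}\,\mathbf{v}\times\operatorname{curl}\mathbf{A} \tag{14.28}$$

Recalling Eqs. (12.34) and (12.35), which relate electromagnetic
fields and potentials, we rewrite (14.28) in final form:

$$\frac{d}{dt}\frac{m\mathbf{v}}{(1 - v^2/c^2)^{1/2}} = e\mathbf{E} + \frac{e}{c}\,\mathbf{v}\times\mathbf{H} \tag{14.29}$$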

Equation (14.29) should be seen as the physical definition of an 
electromagnetic field according to its action on a charged particle. 
In accordance with the requirement of gauge invariance, the scalar 
and vector potentials taken by themselves are not involved in equa¬ 
tions expressing physical quantities. 
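
The right-hand side of (14.29) is simple to evaluate for given fields. The following sketch assumes Gaussian units, as in the text; the numerical values are arbitrary and the helper names are hypothetical. It also checks that the magnetic part of the force is perpendicular to the velocity and therefore does no work.

```python
def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lorentz_force(e, E, H, v, c):
    """Right-hand side of Eq. (14.29): e E + (e/c) v x H."""
    v_cross_h = cross(v, H)
    return tuple(e * E_i + (e / c) * f_i for E_i, f_i in zip(E, v_cross_h))

v = (1.0, 2.0, 3.0)
H = (0.0, 0.0, 5.0)
magnetic_part = lorentz_force(e=2.0, E=(0.0, 0.0, 0.0), H=H, v=v, c=4.0)
print(magnetic_part, dot(v, magnetic_part))
```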

It would be extremely difficult, however, to get along without 
electromagnetic potentials altogether; notably, it would be impossible
to introduce the Lagrangian. We shall show that it satisfies
the condition imposed on the potentials. Indeed, if the potentials
are subjected to the gauge transformation (12.36) and (12.37), that
is, if the substitution

$$\mathbf{A} = \mathbf{A}' + \nabla f, \qquad \varphi = \varphi' - \frac{1}{c}\frac{\partial f}{\partial t}$$

is made, we obtain under the integral in (14.23) the term

$$\frac{e}{c}\left(\mathbf{v}\cdot\nabla f + \frac{\partial f}{\partial t}\right) = \frac{e}{c}\frac{df}{dt}$$

But the total derivative of a function of the coordinates and time 
can always be omitted from a Lagrangian. Accordingly, the equa¬ 
tions of motion involve fields, but not potentials. 

The vector in the right-hand side of (14.29) is called the Lorentz
force. It involves, besides the conventional electric force $e\mathbf{E}$ known
from electrostatics, the term $(e/c)\,\mathbf{v}\times\mathbf{H}$, somewhat resembling the
Coriolis force. This term is due to the part of the Lagrangian that is
linear with respect to velocity.
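The right-hand side of (14.29) can be illustrated numerically. The following is a minimal sketch in Gaussian units; the helper names (`cross`, `dot`, `lorentz_force`) and the sample field values are assumptions chosen for illustration only. It also checks that the magnetic term, being perpendicular to the velocity, contributes nothing to the work done per unit time.

```python
def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))


def lorentz_force(e, v, E, H, c=1.0):
    """Right-hand side of (14.29): e*E + (e/c) * v x H (Gaussian units)."""
    v_cross_H = cross(v, H)
    return tuple(e * E_i + (e / c) * w_i for E_i, w_i in zip(E, v_cross_H))


# Illustrative values only, in units where c = 1.
e = 1.0
v = (0.3, 0.4, 0.0)
E = (1.0, 0.0, 2.0)
H = (0.0, 0.0, 5.0)

force = lorentz_force(e, v, E, H)
power_total = dot(force, v)        # work per unit time done by the full force
power_electric = e * dot(E, v)     # contribution of the electric term alone
print(force, power_total, power_electric)
```

The two powers agree to rounding error, whatever values are chosen for `v` and `H`.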

The magnetic part of the Lorentz force, $(e/c)\,\mathbf{v}\times\mathbf{H}$, is very like
the expression for a force acting on a current in an external magnetic 
field and can be derived from it in much the same way as, in Sec¬ 
tion 12, we obtained Maxwell’s equations from the fundamental 
laws of the theory of electromagnetism. However, using this method 
it is much more difficult to discern the relativistic invariance of 
the expressions. 


The Equations of Motion of a Charge in Four-Dimensional Form. 
We shall now show that in developing Eq. (14.29) from the inva¬ 
riance principle (14.22) we did not lose the relativistic invariance 
of the result, even though it would appear that time was factored 
out in writing (14.23) and is involved in the relationship nonsym- 
metrically with the coordinates. In particular, unlike the coordi¬ 
nates, it does not vary. 

We start with the definition of an electromagnetic field in four- 
dimensional notation. The components of a magnetic field H are 
connected with the vector potential components in the following 
way: 

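$$H_x = \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}, \qquad H_y = \frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}, \qquad H_z = \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}$$

Hence, with the help of the components of the four-vector
involved in (14.22), the magnetic field can be represented as follows:

$$H_x = \frac{\partial A_3}{\partial x_2} - \frac{\partial A_2}{\partial x_3}, \qquad H_y = \frac{\partial A_1}{\partial x_3} - \frac{\partial A_3}{\partial x_1}, \qquad H_z = \frac{\partial A_2}{\partial x_1} - \frac{\partial A_1}{\partial x_2}$$

We now introduce a four-dimensional antisymmetric tensor $F_{ik}$,
connected with the vector and scalar potentials (or with the four-vector
potential) by the equations

$$F_{ik} = \frac{\partial A_k}{\partial x_i} - \frac{\partial A_i}{\partial x_k} \tag{14.30}$$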


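Substituting $A_4 = i\varphi$ and $x_4 = ict$ into this, we obtain the components
of tensor $F_{ik}$ involving 4 as one of the indices:

$$F_{14} = \frac{\partial A_4}{\partial x_1} - \frac{\partial A_1}{\partial x_4} = i\frac{\partial \varphi}{\partial x} - \frac{1}{ic}\frac{\partial A_x}{\partial t} = \frac{1}{i}\left(-\frac{\partial \varphi}{\partial x} - \frac{1}{c}\frac{\partial A_x}{\partial t}\right) = \frac{E_x}{i}$$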



We finally write the whole tensor in matrix form: 


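$$\begin{pmatrix} F_{11} & F_{12} & F_{13} & F_{14} \\ F_{21} & F_{22} & F_{23} & F_{24} \\ F_{31} & F_{32} & F_{33} & F_{34} \\ F_{41} & F_{42} & F_{43} & F_{44} \end{pmatrix} = \begin{pmatrix} 0 & H_z & -H_y & -iE_x \\ -H_z & 0 & H_x & -iE_y \\ H_y & -H_x & 0 & -iE_z \\ iE_x & iE_y & iE_z & 0 \end{pmatrix} \tag{14.31}$$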


The equality of the matrices signifies term-by-term equality of 
the components. Thus, in four-dimensional notation electric and 
magnetic fields constitute a single antisymmetric tensor. By defi¬ 
nition, it has zeros along its principal diagonal, while the compo¬ 
nents symmetrically located with respect to the principal diagonal 
have opposite signs. 

Now we multiply the first equation in (14.29) by dt/ds and expand 
its right-hand side in the components of the magnetic field: 


$$\frac{dp_x}{ds} = eE_x\frac{dt}{ds} + \frac{e}{c}\frac{dy}{ds}H_z - \frac{e}{c}\frac{dz}{ds}H_y$$


Substituting p x from (14.14), and the electromagnetic field compo¬ 
nents from (14.31), we arrive at the equation 


$$mc\,\frac{d^2 x_1}{ds^2} = \frac{e}{c}\,F_{1k}\frac{dx_k}{ds}$$


The notation is similar for the other two equations. 

Let us combine the obtained set in one four-dimensional equation.
What is the meaning of the fourth equation? Since there were only
three relationships in (14.29), we must show that the fourth is a corollary
of them.

For this, multiply (14.29) by $\mathbf{v}$. In the left-hand side we obtain

$$\mathbf{v}\cdot\frac{d\mathbf{p}_0}{dt} = \frac{\partial E_0}{\partial \mathbf{p}_0}\cdot\frac{d\mathbf{p}_0}{dt} = \frac{dE_0}{dt}$$

In the right-hand side the product of the velocity times the com¬ 
ponent of the Lorentz force vanishes, because v X H and v are 
mutually perpendicular. There remains the equation 

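$$\frac{dE_0}{dt} = e(\mathbf{E}\cdot\mathbf{v}) \tag{14.32}$$

Substituting $E_0 = (mc^2/i)(dx_4/ds)$ and the electric field components
from (14.31) into it, we observe (after multiplying by $dt/ds$) that
we really obtain the fourth equation for a four-dimensional set:

$$mc\,\frac{d^2 x_i}{ds^2} = \frac{e}{c}\,F_{ik}\frac{dx_k}{ds} \tag{14.33}$$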

In the left-hand side of Eq. (14.32) we have the change in the 
particle’s energy in unit time, that is, the work done on it in that 
time. On the right-hand side we have the scalar product of the force 
acting on the particle multiplied by its velocity: the conventional 
expression of work per unit time. This quantity is known as the 
Lorentz work. 
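The statement that the three-dimensional equations (14.29) and (14.32) are the four components of one tensor equation can be checked numerically. The sketch below is illustrative only: the function names and sample values are assumptions. It builds the matrix (14.31) with complex entries and contracts it with $dx_k/dt = (v_x, v_y, v_z, ic)$; the first three components should reproduce the Lorentz force, and the fourth should equal $(ie/c)(\mathbf{E}\cdot\mathbf{v})$.

```python
def field_tensor(E, H):
    """Matrix (14.31) of F_ik, with x_4 = ict and A_4 = i*phi."""
    Ex, Ey, Ez = E
    Hx, Hy, Hz = H
    i = 1j
    return [
        [0,      Hz,     -Hy,    -i * Ex],
        [-Hz,    0,      Hx,     -i * Ey],
        [Hy,     -Hx,    0,      -i * Ez],
        [i * Ex, i * Ey, i * Ez, 0],
    ]


def four_force(e, v, E, H, c=1.0):
    """(e/c) * F_ik * dx_k/dt, where dx_k/dt = (vx, vy, vz, i*c)."""
    F = field_tensor(E, H)
    u = (v[0], v[1], v[2], 1j * c)
    return [(e / c) * sum(F[i][k] * u[k] for k in range(4)) for i in range(4)]


# Illustrative values only, in units where c = 1.
g = four_force(1.0, (0.3, 0.4, 0.0), (1.0, 0.0, 2.0), (0.0, 0.0, 5.0))
print(g)  # first three entries: e*E + (e/c) v x H; fourth: (i e/c) (E . v)
```

Multiplying each component by $dt/ds$ then gives the right-hand side of the four-dimensional equation.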

We have thus obtained a relativistically invariant formulation 
of the equations of mechanics of a charged particle in an electro¬ 
magnetic field. The essential physical distinction from the equations 
of Newtonian mechanics is that in the case of an electromagnetic 
field we cannot introduce the concept of the interaction energy of the 
particles in the system. Each particle interacts directly only with 
the electromagnetic field, according to the fact that in relativity 
theory only short-range action is possible. 

Equations (14.29) or (14.33) can be solved only if the external 
electromagnetic field acting on the particle is known. To have a com¬ 
plete set of electrodynamic equations we must learn, in turn, to find 
the field from the given motion of the charges. In principle, the 
problem is solved with the help of Maxwell’s equations obtained 
in Section 12. A relativistically invariant derivation will be present¬ 
ed in the next section. In Section 20 we shall point out certain 
difficulties due to the application of the simultaneous set of Maxwell’s 
equations and (14.33). 


EXERCISES 

1. A fast proton possessing energy $E$ collides with a stationary proton.
Determine the portion of the energy of the colliding proton that can be
dissipated in an inelastic process (for example, the creation of a
proton-antiproton pair).

Solution. The energy dissipation of the colliding particles in an inelastic
process is greatest in the reference frame in which the total momentum of
both particles is zero (see Sec. 6). Let the momentum of one particle in this
system be $\mathbf{p}_0$, of the other $-\mathbf{p}_0$, and the energy of each $E_0 = (m^2c^4 + c^2p_0^2)^{1/2}$.
We proceed from the invariance of the four-dimensional product
$p_i^{(1)}p_i^{(2)} = p_{0i}^{(1)}p_{0i}^{(2)}$. Since in the laboratory reference frame the momentum of the
stationary proton is zero while its energy is $mc^2$, we obtain

$$mc^2E = E_0^2 + c^2p_0^2$$

From this it follows that, in a centre-of-mass reference frame the required
total energy is equal to

$$2E_0 = \left[2mc^2(E + mc^2)\right]^{1/2}$$

Thus, the relative proportion of the energy of the colliding proton that 
may be dissipated in an inelastic process is inversely proportional to the 
square root of its initial energy. That is why at present such attention is 
being given to the development of accelerators with colliding beams, in 
which the laboratory reference system and centre-of-mass reference coincide. 
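As a rough numerical illustration of this result, the sketch below evaluates the centre-of-mass energy $[2mc^2(E + mc^2)]^{1/2}$ and inverts it to find the smallest incident energy for which a given total rest mass can be produced. The function names are hypothetical, and $E$ is taken to be the total energy of the incident proton, rest energy included.

```python
import math


def cm_energy(E, m, c=1.0):
    """Total energy in the centre-of-mass frame: [2 m c^2 (E + m c^2)]^(1/2)."""
    return math.sqrt(2.0 * m * c**2 * (E + m * c**2))


def threshold_energy(M_total, m, c=1.0):
    """Smallest incident total energy E for which cm_energy equals M_total*c^2."""
    return (M_total * c**2) ** 2 / (2.0 * m * c**2) - m * c**2


# Proton-antiproton pair creation: the final state holds four proton masses.
m = 1.0                       # proton mass in units where m = c = 1
E_min = threshold_energy(4.0 * m, m)
print(E_min, cm_energy(E_min, m))
```

In these units the threshold comes out as $E = 7mc^2$, that is, a kinetic energy of $6mc^2$, although only $2mc^2$ goes into the new pair.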

2. Consider the collision of a particle of zero mass with a stationary
particle of mass $m$. Determine the energy of the incident particle after the
collision, if its energy prior to the collision, $E$, and the deflection angle, $\vartheta$,
are known.

Answer.

$$E' = \frac{E}{1 + (E/mc^2)(1 - \cos\vartheta)}$$
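The answer can be evaluated directly. This is a minimal sketch with a hypothetical function name, writing the energy after the collision as a function of the initial energy, the target mass, and the deflection angle in radians.

```python
import math


def scattered_energy(E, m, theta, c=1.0):
    """Energy of a massless particle after scattering through angle theta."""
    return E / (1.0 + (E / (m * c**2)) * (1.0 - math.cos(theta)))


# Illustrative values in units where m = c = 1.
forward = scattered_energy(1.0, 1.0, 0.0)        # no deflection, no energy loss
backward = scattered_energy(1.0, 1.0, math.pi)   # back-scattering, maximum loss
print(forward, backward)
```

For back-scattering with $E = mc^2$ the particle keeps one third of its energy.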

3. Find the dependence of the velocity of a rocket upon the mass of 
the burned and ejected fuel, if its initial mass M 0 and the velocity v of the 
ejected particles of the fuel relative to the rocket are known. 

Solution . Let the sought velocity of the rocket be V , its initial mass, 
M 0 , and its instantaneous mass, M. If a certain quantity dm of the fuel is 
ejected, then, relative to a fixed reference frame, the momentum conserva¬ 
tion law is written as follows: 

$$d\,\frac{MV}{(1 - V^2/c^2)^{1/2}} = \frac{v'\,dm}{(1 - v'^2/c^2)^{1/2}}$$

Here the velocity $v'$ of the ejected fuel relative to the fixed reference frame is

$$v' = \frac{v - V}{1 - vV/c^2}$$

This takes account of the fact that $v$ and $V$ are oppositely directed.
Substituting $v'$ into the momentum-conservation equation, we obtain

$$d\,\frac{MV}{(1 - V^2/c^2)^{1/2}} = \frac{(v - V)\,dm}{(1 - v^2/c^2)^{1/2}(1 - V^2/c^2)^{1/2}}$$

The energy-conservation law is more conveniently written in a reference
frame fixed with respect to the rocket:

$$-dM = \frac{dm}{(1 - v^2/c^2)^{1/2}}$$

After eliminating $dm$ and carrying out the necessary cancellations, we
have

$$\frac{dM}{M} = -\frac{dV}{v\,(1 - V^2/c^2)}$$

whence, after integrating, we arrive at the required equation:

$$\frac{M}{M_0} = \frac{(1 - V/c)^{c/2v}}{(1 + V/c)^{c/2v}}$$

If total annihilation of matter occurs in the rocket, and the ejected
particles are photons (light quanta), then

$$\frac{M}{M_0} = \left(\frac{1 - V/c}{1 + V/c}\right)^{1/2}$$

In the nonrelativistic limit, when $v \ll c$ and $V \ll c$, this reduces to $M/M_0 = e^{-V/v}$.
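The mass ratio obtained in this exercise is easy to tabulate. The sketch below (function names are assumptions) evaluates the relativistic expression $[(1 - V/c)/(1 + V/c)]^{c/2v}$ and compares it with the familiar classical rocket formula $e^{-V/v}$, to which it should tend when both velocities are small compared with $c$.

```python
import math


def mass_ratio(V, v, c=1.0):
    """M/M0 = [(1 - V/c) / (1 + V/c)]^(c / (2 v)) for exhaust velocity v."""
    return ((1.0 - V / c) / (1.0 + V / c)) ** (c / (2.0 * v))


def mass_ratio_classical(V, v):
    """Nonrelativistic rocket formula M/M0 = exp(-V/v)."""
    return math.exp(-V / v)


# Photon rocket (v = c) reaching V = 0.6 c, in units where c = 1.
photon = mass_ratio(0.6, 1.0)

# Slow rocket: V = 1e-4 c with exhaust velocity v = 1e-3 c.
slow = mass_ratio(1e-4, 1e-3)
slow_classical = mass_ratio_classical(1e-4, 1e-3)
print(photon, slow, slow_classical)
```

A photon rocket must thus burn half its mass to reach $V = 0.6c$.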


4. A stationary particle of mass $m$ decays into two particles of mass $m_1$
and $m_2$. Determine the energies of the end products.

Answer.

$$E_1 = c^2\,\frac{m_1^2 - m_2^2 + m^2}{2m}, \qquad E_2 = c^2\,\frac{m_2^2 - m_1^2 + m^2}{2m}$$
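A quick consistency check of these expressions: the two energies must add up to $mc^2$, and the two particles must carry equal and opposite momenta, so that $E_1^2 - m_1^2c^4 = E_2^2 - m_2^2c^4$. The following sketch uses a hypothetical function name and arbitrary sample masses.

```python
def decay_energies(m, m1, m2, c=1.0):
    """Energies of the two products of a particle of mass m decaying at rest."""
    E1 = c**2 * (m**2 + m1**2 - m2**2) / (2.0 * m)
    E2 = c**2 * (m**2 - m1**2 + m2**2) / (2.0 * m)
    return E1, E2


# Illustrative masses in units where c = 1.
m, m1, m2 = 5.0, 3.0, 1.0
E1, E2 = decay_energies(m, m1, m2)
p1_squared = E1**2 - m1**2   # c^2 p^2 of the first product
p2_squared = E2**2 - m2**2   # c^2 p^2 of the second product
print(E1, E2, p1_squared, p2_squared)
```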


5. Develop the equations of motion of a charge in an electromagnetic 
field directly from the invariant expression for action (14.22). 

Solution. The variation of action has the form 

$$\delta S = \int\left(-mc\,\delta\,ds + \frac{e}{c}\frac{\partial A_k}{\partial x_i}\,\delta x_i\,dx_k + \frac{e}{c}A_k\,d\,\delta x_k\right) = 0$$

Taking advantage of the fact that $ds^2 = -dx_i^2$, we find the variation of the
interval

$$\delta\,ds = -\frac{dx_i\,d\,\delta x_i}{ds}$$

Transforming the variation differentials by parts, we obtain

$$\delta S = \int\left(-mc\,d\frac{dx_i}{ds} + \frac{e}{c}\frac{\partial A_k}{\partial x_i}\,dx_k - \frac{e}{c}\frac{\partial A_i}{\partial x_k}\,dx_k\right)\delta x_i$$

Equating the factors of the variations to zero, we arrive at the set of
equations (14.33):

$$mc\,\frac{d^2x_i}{ds^2} = \frac{e}{c}\left(\frac{\partial A_k}{\partial x_i} - \frac{\partial A_i}{\partial x_k}\right)\frac{dx_k}{ds} = \frac{e}{c}\,F_{ik}\frac{dx_k}{ds}$$

6. Find the scalar and vector potentials of a freely moving charge.

Solution. In its own reference frame, the charge creates only an electrostatic
field, the scalar potential of which is equal to $\varphi_0 = e/r_0$. There is no
magnetic field in this frame, so that the vector potential is equal to zero.
Transforming the scalar potential as the fourth component of the vector
according to the general equations (13.33) and (13.34), and taking into
account that $A_x$ in the right-hand sides of these equations vanishes, we
obtain

$$\varphi = \frac{\varphi_0}{(1 - v^2/c^2)^{1/2}} = \frac{e}{r_0\,(1 - v^2/c^2)^{1/2}}$$

$$\mathbf{A} = \frac{\varphi_0\,\mathbf{v}/c}{(1 - v^2/c^2)^{1/2}} = \frac{e\mathbf{v}}{r_0\,c\,(1 - v^2/c^2)^{1/2}}$$

Furthermore, we must express $r_0$ in terms of the coordinates in the
system relative to which the charge is moving:

$$r_0 = (x_0^2 + y_0^2 + z_0^2)^{1/2} = \left(\frac{(x - vt)^2}{1 - v^2/c^2} + y^2 + z^2\right)^{1/2}$$

Instead of $vt$ we may substitute $\xi$, that is, the abscissa of the moving
charge. The electromagnetic disturbance arrives at time $t$ at the point with
coordinates $x, y, z$, not from point $(\xi, 0, 0)$, where the charge is located at the
given instant, but from point $(\xi', 0, 0)$, where it was located at the time
of the emission of the disturbance. If the distance from point $(\xi', 0, 0)$
to the point with coordinates $x, y, z$ is $R'$, the time required for the
electromagnetic disturbance to travel along it is $R'/c$. The charge, moving with
the velocity $v$, takes the same time, $(\xi - \xi')/v$, to travel the path $\xi - \xi'$.
From this we have the equation

$$\frac{\xi - \xi'}{v} = \frac{\left[(x - \xi')^2 + y^2 + z^2\right]^{1/2}}{c} = \frac{R'}{c}$$

Here, $\xi = vt$, that is, the charge's abscissa at the current time $t$. The
difference $x - \xi'$ is the projection of vector $\mathbf{R}'$ on the direction of the velocity,
$(\mathbf{v}\cdot\mathbf{R}')/v$. Substituting the obtained expressions into $r_0$, we finally obtain

$$\varphi = \frac{e}{R' - \mathbf{v}\cdot\mathbf{R}'/c}, \qquad \mathbf{A} = \frac{e\mathbf{v}}{c\,(R' - \mathbf{v}\cdot\mathbf{R}'/c)}$$


7. Find the motion of a charge in a constant uniform magnetic field.

Solution. If the field is in the direction of the $z$ axis, the equations of
motion are of the following form:

$$\frac{dp_x}{dt} = \frac{e}{c}\frac{dy}{dt}|\mathbf{H}|, \qquad \frac{dp_y}{dt} = -\frac{e}{c}\frac{dx}{dt}|\mathbf{H}|, \qquad \frac{dp_z}{dt} = 0$$

Since the magnetic field does not do work on the charge, $p^2$ = constant,
$p_z$ = constant, $p_x^2 + p_y^2$ = constant, and

$$p_x = \frac{mv_x}{(1 - v^2/c^2)^{1/2}} = \frac{Ev_x}{c^2}$$

We look for the coordinates $x$ and $y$ in the form:

$$x = r\cos\omega t, \qquad y = r\sin\omega t$$

For $r$ and $\omega$ the following expressions result:

$$r = \frac{Ev}{ec\,|\mathbf{H}|}, \qquad \omega = \frac{ec\,|\mathbf{H}|}{E}$$

The particle moves along a helix. For small velocities, $\omega$ reduces to
the constant value $e|\mathbf{H}|/mc$.
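The expressions for $r$ and $\omega$ can be put into a short numerical sketch. The function name and sample values are assumptions; `v_perp` is taken to be the velocity component perpendicular to the field, and `E` is the total (conserved) energy of the particle.

```python
def gyration(E, v_perp, H, e=1.0, c=1.0):
    """Radius and angular frequency of the circular motion across the field."""
    r = E * v_perp / (e * c * abs(H))
    omega = e * c * abs(H) / E
    return r, omega


# Illustrative values in units where e = c = 1.
r, omega = gyration(E=2.0, v_perp=0.5, H=4.0)

# Slow particle of mass m = 1: E is close to m c^2, so omega -> e|H|/(m c).
_, omega_slow = gyration(E=1.0, v_perp=1e-3, H=4.0)
print(r, omega, omega_slow)
```

The product `r * omega` returns the transverse velocity, as it must for uniform circular motion.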

8. Find the motion of a charge in a constant uniform electric field.

Solution. The equations of motion are

$$\frac{dp_x}{dt} = e|\mathbf{E}|, \qquad \frac{dp_y}{dt} = 0, \qquad \frac{dp_z}{dt} = 0, \qquad \frac{dE_0}{dt} = e|\mathbf{E}|\frac{dx}{dt}$$

From the last equation we obtain

$$\left[m^2c^4 + c^2(p_x^2 + p_y^2 + p_z^2)\right]^{1/2} - \left[m^2c^4 + c^2(p_{x0}^2 + p_{y0}^2 + p_{z0}^2)\right]^{1/2} = e|\mathbf{E}|\,x$$

From the first three equations

$$p_x - p_{x0} = e|\mathbf{E}|\,t, \qquad p_y - p_{y0} = 0, \qquad p_z - p_{z0} = 0$$

These equations together give $x$ as a function of $t$.

If $p_{z0} = 0$, then dividing $p_x$ by $p_y$ we have an expression for $dx/dy$
in terms of $x$ (by eliminating $t$ from the energy integral). The trajectory is
of the form of a catenary.

9. Express the energy of motion of a charge in an attracting Coulomb 
field in terms of the adiabatic invariants (action variables). 

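Solution. Since the potential has only a scalar component, with $e\varphi = -Ze^2/r$, we find from
(14.27), after passing to plane motion in polar coordinates:

$$\frac{1}{c^2}(E - e\varphi)^2 = m^2c^2 + p_r^2 + \frac{p_\varphi^2}{r^2}$$

From this the radial action variable is

$$J_r = \frac{1}{2\pi}\oint\left[\frac{1}{c^2}\left(E + \frac{Ze^2}{r}\right)^2 - m^2c^2 - \frac{J_\varphi^2}{r^2}\right]^{1/2}dr$$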



The integral is taken over the whole domain in which the radicand is
real, taking into account that $E < mc^2$, since otherwise the motion is infinite.
The assumption $E < mc^2$ corresponds to $E < 0$ in Newtonian mechanics,
that is, to finite motion.

Integrating, we find 

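$$J_r = \frac{Ze^2E}{c\,(m^2c^4 - E^2)^{1/2}} - \left(J_\varphi^2 - \frac{Z^2e^4}{c^2}\right)^{1/2}$$

whence we express the energy in terms of the action variables $J_r$ and $J_\varphi$:

$$E = mc^2\left[1 + \frac{Z^2e^4}{c^2\left[J_r + \left(J_\varphi^2 - Z^2e^4/c^2\right)^{1/2}\right]^2}\right]^{-1/2}$$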


The derivatives $\partial E/\partial J_r$ and $\partial E/\partial J_\varphi$ are not equal, so that the path of the
charge is not closed. It has the shape of an ellipse whose axes are rotating 
(a rosette). 


15 


ACTION OF AN ELECTROMAGNETIC FIELD 

The Lorentz Transformations of Field Components. In this section 
it will be shown that Maxwell’s equations can be treated as mechan¬ 
ical equations of motion applied to electromagnetic fields. For 
this they must be developed from the general principle of mechanics, 
just as the equations of the mechanics of mass points were developed 
in Section 2 from the principle of least action. 

To formulate the corresponding principle for electromagnetic 
fields we must proceed from similar requirements: invariance of the 
equations with respect to spatial and temporal displacements of the 
coordinate system, its rotations, and transformations to other 
inertial frames of reference. 

The requirement of invariance with respect to spatial and tempo¬ 
ral displacements reduces to the requirement that the action of the 
field does not involve explicit functions of the coordinates or time. 
In other words, it can depend only on quantities describing the 
field, just as the action of a closed mechanical system depends only 
on its generalized coordinates and velocities. 

As for the condition of invariance with respect to spatial rota¬ 
tions and transformations to other inertial reference frames, in 
Section 13 it was shown that both requirements are best satisfied 
by writing the equations in four-dimensional tensor notation. 

We should therefore begin with the question of transforming the 
electromagnetic field components so as to see what invariant quan¬ 
tities can be obtained. 

As was shown in Section 14, a field is a four-dimensional anti¬ 
symmetric tensor of rank 2 the components of which are given by 
Eq. (14.31). Let us apply to it the ordinary Lorentz transformation 
for the case when the relative velocity of the reference frames is 
directed along the x axis. Such a transformation affects only the 
first and fourth tensor indices according to Eqs. (13.33) and (13.34). 






It can be seen at once from this that the magnetic field component
along the relative velocity, H x = F 23 , does not transform at all: 
it has the tensor indices 2 and 3. 

Take the magnetic field component $H_z$ and the electric field
component $E_y$. In tensor notation these are $F_{12}$ and $F_{42}/i$. The index 2
is not affected by the transformation, while indices 4 and 1 are
transformed according to the general rules (13.33) and (13.34).
Thus we obtain the transformation equations of all four lateral
electromagnetic field components:

$$H_z' = \frac{H_z - VE_y/c}{(1 - V^2/c^2)^{1/2}} \tag{15.1a}$$

$$H_y' = \frac{H_y + VE_z/c}{(1 - V^2/c^2)^{1/2}} \tag{15.1b}$$

$$E_y' = \frac{E_y - VH_z/c}{(1 - V^2/c^2)^{1/2}} \tag{15.2a}$$

$$E_z' = \frac{E_z + VH_y/c}{(1 - V^2/c^2)^{1/2}} \tag{15.2b}$$


There remains the longitudinal electric field component $E_x = iF_{14}$,
for which both indices 1 and 4 transform. In this case the transformation
can be considered as a rotation in two-dimensional space
with coordinates $x_1$ and $x_4$. Then $F_{14}$ is an antisymmetric tensor
of rank 2 in two-dimensional space. But it was shown in Section 11
that a completely antisymmetric tensor whose rank is equal to the
number of spatial dimensions is invariant under rotations. What
was said of three-dimensional space can be literally transferred to
any number of dimensions, including two. Hence, $F_{14}$ is invariant
under rotations of the form (13.33) and (13.34). Of course, this can
also be demonstrated with the help of computations according
to the general formulas of tensor transformations.

Thus, for both longitudinal field components we have 

$$H_x' = H_x \tag{15.1c}$$

$$E_x' = E_x \tag{15.2c}$$

Let us find the quantities invariant with respect to the transformations.
An arbitrary tensor $A_{ik}$ has an invariant $A_{ii}$, that is,
the sum of the diagonal components. This quantity, which possesses
only dummy indices, is called the trace of a tensor.
But since tensor $F_{ik}$ is antisymmetric, all its diagonal elements
vanish (cf. (14.31)). Consequently $F_{ii} = 0$.

Thus, there is no linear invariant; but there must evidently be
a quadratic invariant $F_{ik}F_{ik}$, that is, the sum of the squares of all
the tensor components:

$$F_{ik}F_{ik} = F_{ik}^2 = 2\left(|\mathbf{H}|^2 - |\mathbf{E}|^2\right) \tag{15.3}$$

This quantity is of fundamental importance for the subsequent 
treatment. 

With the help of a completely antisymmetric tensor of rank 4
we can construct one more invariant quantity:

$$\varepsilon_{iklm}F_{ik}F_{lm} \tag{15.4}$$

Since $\varepsilon_{iklm}$ is an invariant tensor, we have, of course, obtained
a relativistically invariant quantity. Let us expand it with the
help of the electromagnetic field components:

$$\varepsilon_{iklm}F_{ik}F_{lm} = 8i\,(E_xH_x + E_yH_y + E_zH_z) = 8i\,(\mathbf{E}\cdot\mathbf{H})$$

Tensor $\varepsilon_{iklm}$ has 4! = 24 nonzero components.

We have thus found one more quantity which does not change 
under the Lorentz transformations—the scalar product (E-H). 
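Both invariants can be verified numerically from the transformation equations (15.1) and (15.2). The sketch below is only an illustration, with assumed function names and arbitrary field values: it boosts a pair of fields along the $x$ axis and compares $|\mathbf{H}|^2 - |\mathbf{E}|^2$ and $(\mathbf{E}\cdot\mathbf{H})$ before and after.

```python
import math


def boost_fields(E, H, V, c=1.0):
    """Transform E and H to a frame moving with velocity V along x, (15.1)-(15.2)."""
    gamma = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    beta = V / c
    Ex, Ey, Ez = E
    Hx, Hy, Hz = H
    E_new = (Ex, gamma * (Ey - beta * Hz), gamma * (Ez + beta * Hy))
    H_new = (Hx, gamma * (Hy + beta * Ez), gamma * (Hz - beta * Ey))
    return E_new, H_new


def invariants(E, H):
    """Return (|H|^2 - |E|^2, E . H)."""
    h2 = sum(x * x for x in H)
    e2 = sum(x * x for x in E)
    return h2 - e2, sum(a * b for a, b in zip(E, H))


# Arbitrary illustrative fields, units where c = 1.
E = (1.0, 2.0, 3.0)
H = (-0.5, 0.7, 1.1)
E_new, H_new = boost_fields(E, H, V=0.6)
before = invariants(E, H)
after = invariants(E_new, H_new)
print(before, after)
```

The individual components change, but the two printed pairs agree to rounding error.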

Scalars and Pseudoscalars. Vectors and Pseudovectors. We shall
now show that the scalar product $(\mathbf{E}\cdot\mathbf{H})$ is in a sense not a true scalar.
Namely, $(\mathbf{E}\cdot\mathbf{H})$ reverses its sign if the signs of all the coordinates
are reversed: $x' = -x$, $y' = -y$, $z' = -z$.

This transformation is known as inversion of the coordinate system , 
or mirror reflection. Indeed, it transforms a right-handed system 
into a left-handed system; but that is just how a right-hand system 
appears in a mirror (the right hand in a mirror appears as the left 
hand). 

It is not hard to see that no rotation of the coordinate axes can 
lead to an inversion. The transformation matrix corresponding
to an inversion must be written in the form

$$\begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}$$



The determinant of this matrix is equal to —1, whereas the cor¬ 
responding determinant for a rotation transformation is equal to +1. 
There exists no continuous transformation from +1 to —1. 

Vectors behave in different ways with respect to inversions. The
velocity vector $\mathbf{v} = d\mathbf{r}/dt$ changes sign together with $\mathbf{r}$. The momentum
vector $\mathbf{p} = m\mathbf{v}$ evidently also changes its sign. The force
vector $\mathbf{F} = \dot{\mathbf{p}}$ possesses the same property. All these vectors are
known as true, or polar, vectors or simply vectors.

The angular momentum vector $\mathbf{M} = \mathbf{r}\times\mathbf{p}$, the components of
which involve the products of the components of $\mathbf{r}$ and $\mathbf{p}$, evidently does not
change sign; nor does the torque vector $\mathbf{K} = \mathbf{r}\times\mathbf{F}$.
Such vectors are called axial vectors or pseudovectors: they behave like vectors
in rotations of the coordinate system, but differently in inversions.
It was shown in Section 11 that M and K, that is, actually vector 
products of true vectors, should be defined as antisymmetric tensors 
of rank 2. They possess vector properties only in three-dimensional 
space, because the number of components of an antisymmetric 
tensor and a vector coincide. Inversion reveals the nonvector nature 
of M and K. 

It is not hard to see that angular velocity is a pseudovector. It is 
related to angular momentum by the equality 

$$M_\alpha = I_{\alpha\beta}\,\omega_\beta$$

(see Sec. 9). But the components of the inertia tensor depend on the 
coordinates quadratically and do not change sign in inversions, so 
that in this respect angular velocity behaves like angular momentum. 
This is because the direction of the angular velocity vector was 
chosen arbitrarily (Sec. 8). 

We shall now show that the vector potential A is a true vector. 
It appears in the expression for action (14.23) in a scalar product 
with the velocity vector v, a true vector. From experience we know 
that the equations of mechanics are invariant under inversion: 
their form does not change in the substitution of a left-handed system 
for a right-handed system. Consequently, the scalar product (A-v) 
involved in the Lagrangian should not change sign under inversion. 
For that A must be a true vector. Since (14.23) can be regarded as 
a definition of A, the vector properties of A follow precisely from it. 

It then follows that the magnetic field

$$\mathbf{H} = \operatorname{curl}\mathbf{A} = \nabla\times\mathbf{A}$$

is a pseudovector: the del operator, $\nabla = \partial/\partial\mathbf{r}$, is a true vector, which
transforms under inversion like $\mathbf{r}$, and the vector product of two true
vectors does not change sign.

An electric field, defined as $\mathbf{E} = -\nabla\varphi - (1/c)(\partial\mathbf{A}/\partial t)$, is a true
vector by virtue of the fact that $\nabla$ and $\mathbf{A}$ are true vectors. The scalar
product $(\mathbf{E}\cdot\mathbf{H})$ of vector $\mathbf{E}$ multiplied by pseudovector $\mathbf{H}$ changes
sign under inversion, and is thus a pseudoscalar; $(\mathbf{E}\cdot\mathbf{H})$ remains
invariant only in rotations of the coordinate system, but not in
inversions.
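The different behaviour of polar and axial vectors under inversion can be seen in a few lines. This is a minimal sketch with assumed helper names and arbitrary numbers: the components of polar vectors are negated, the vector product of two polar vectors is recomputed from the negated components, and the scalar product of a polar vector with an axial one is compared before and after.

```python
def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))


def invert(a):
    """Components of a polar vector after the inversion x' = -x, y' = -y, z' = -z."""
    return tuple(-x for x in a)


# Angular momentum M = r x p: both factors are polar vectors.
r = (1.0, 2.0, 3.0)
p = (0.5, -1.0, 2.0)
M = cross(r, p)
M_inverted = cross(invert(r), invert(p))   # components of M are unchanged

# E is polar, H is axial: only the components of E change sign.
E = (0.2, -0.4, 1.0)
H = (1.5, 0.5, -2.0)
scalar = dot(E, H)
scalar_inverted = dot(invert(E), H)        # the pseudoscalar changes sign
print(M, M_inverted, scalar, scalar_inverted)
```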

The four-dimensional tensor definition of (E-H) also indicates 
its pseudoscalar nature: each term includes three spatial indices 
and one temporal, 4 , so that a change in the sign of the three coordi¬ 
nates changes the sign of (E-H). 

One might think that the equations of electrodynamics should
not be invariant in inversions, since in one case the right-hand rule
is applied, while in the other it is the left-hand rule. But actually
these rules derive from the convention of defining the signs (or
poles) of a magnetic field. The laws of electrodynamics would
remain essentially unaffected if we interchanged the names of the poles and at the
same time interchanged the respective hand rules.

The law of right and left symmetry is not universal. There is in 
nature a special class of interactions of a nonelectromagnetic nature— 
the weak interactions—in which there is no inversion symmetry. 
This symmetry is an expression of a specific property of a specific 
type of interaction, which can be established only experimentally. 

We have thus restricted ourselves to two invariant quantities:
$|\mathbf{H}|^2 - |\mathbf{E}|^2$ and $(\mathbf{E}\cdot\mathbf{H})$. All the others can be expressed in terms
of them.

The Field Linearity of Maxwell's Equations. A true invariant can
be formed from $(\mathbf{E}\cdot\mathbf{H})$ by squaring it. Of course, it is not apparent
in advance why such a quantity (or the square of the invariant
$|\mathbf{H}|^2 - |\mathbf{E}|^2$) cannot enter the expression for the action
of an electromagnetic field, as well as terms of higher powers. However,
if terms higher than quadratic with respect to the field were
included, the equations of electrodynamics obtained by action
variation would involve nonlinear terms, that is, the squares of
fields, their products, etc.

The basic difference between a nonlinear and linear equation is 
that the sum of the two solutions of a nonlinear equation is not 
a solution, because cross terms of both solutions appear. But it is 
well known that two electromagnetic waves propagating in a vacu¬ 
um are simply added, without distorting each other. In nonlinear 
theory, the velocity of a wave depends on its amplitude, whereas in 
electrodynamics all disturbances propagate in vacuum with the 
same speed. 

Proceeding from this experimental fact, we must select only the
quadratic invariants of the electromagnetic field for the field equations
to be linear. With two quadratic invariants we could have
taken their linear combination in the action expression. But $(\mathbf{E}\cdot\mathbf{H})$
being a pseudoscalar, an inversion would alter the relative sign
between it and $|\mathbf{H}|^2 - |\mathbf{E}|^2$. If, therefore, we retain $(\mathbf{E}\cdot\mathbf{H})$, the
equations of electrodynamics will be of different form in right-handed
and left-handed coordinate systems, and no redefinition of the sign
of the magnetic field can help.

Consequently, there remains only one quadratic quantity,
$|\mathbf{H}|^2 - |\mathbf{E}|^2$, which can be involved in the expression for the action of
an electromagnetic field.

Another quadratic quantity, $A_iA_i = |\mathbf{A}|^2 - \varphi^2$, can be obtained
from the vector potential, but it is not gauge invariant and
so cannot appear in the action formula.

Field-Charge Interaction. We obtained the corresponding expression
(14.22) in Section 14. For a separate charge it has the form
$S_{\text{int}} = \int (e/c)A_k\,dx_k$. It is more convenient to go over from a point
charge to a spatially distributed one (as shown in Section 12, this
can always be done).

To trace the relativistic invariance of the equation, let us show
that $\mathbf{j}$ and $\rho$, that is, the current density and the charge density,
together comprise one four-vector. The charge differential $de$ is by
definition invariant:

$$de = \rho\,d^{(3)}\tau \tag{15.5}$$

Consider also the reference frame in which the charges are at rest, and denote
the quantities $\rho$ and $d^{(3)}\tau$ in it by the subscript 0. Since
charge is an invariant quantity, we can write the equation

$$de = \rho_0\,d^{(3)}\tau_0 = \rho\,d^{(3)}\tau \tag{15.6}$$

But as was shown in Exercise 5, Section 13, the volume element
$d^{(3)}\tau_0$ moving together with the charged particles contracts relative
to a stationary volume element:

$$d^{(3)}\tau = d^{(3)}\tau_0\,(1 - v^2/c^2)^{1/2} \tag{15.7}$$


It follows from this that the charge density $\rho$ is connected with
its density $\rho_0$ in the reference frame where the charges are at rest by the relationship

$$\rho = \frac{\rho_0}{(1 - v^2/c^2)^{1/2}} \tag{15.8}$$

The coefficient involved here is expressed in terms of the infinitesimal
interval associated with the motion of the charges (see
Sec. 14):

$$\frac{1}{(1 - v^2/c^2)^{1/2}} = \frac{c\,dt}{ds}$$

Consequently, the charge density can, in turn, be represented
as the fourth component of a four-vector:

$$\rho = \rho_0\,\frac{c\,dt}{ds} = \frac{\rho_0}{i}\frac{dx_4}{ds} = \frac{j_4}{ic} \tag{15.9a}$$

Then the current density vector is$^4$

$$j_\alpha = c\rho_0\,\frac{dx_\alpha}{ds} \tag{15.9b}$$
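That $\rho$ and $\mathbf{j}$ form a four-vector can be illustrated by checking that the sum $j_kj_k$ does not depend on the velocity of the charges. In the sketch below (function name and numbers are assumptions) the four components are built from the rest density $\rho_0$ and the velocity, with $j_4 = ic\rho$; the sum of squares should equal $-c^2\rho_0^2$ for any $\mathbf{v}$.

```python
import math


def four_current(rho0, v, c=1.0):
    """Components (j_x, j_y, j_z, j_4 = i c rho) for charges moving with velocity v."""
    v2 = sum(x * x for x in v)
    gamma = 1.0 / math.sqrt(1.0 - v2 / c**2)
    rho = rho0 * gamma                       # Eq. (15.8)
    return (rho * v[0], rho * v[1], rho * v[2], 1j * c * rho)


def invariant_square(j):
    """Sum j_k j_k over all four components."""
    return sum(component * component for component in j)


# Illustrative values in units where c = 1.
j_slow = four_current(2.0, (0.1, 0.0, 0.0))
j_fast = four_current(2.0, (0.6, 0.0, 0.0))
print(invariant_square(j_slow), invariant_square(j_fast))
```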


We now substitute $j_k\,d^{(3)}\tau\,dt = c\rho_0\,(dx_k/ds)\,d^{(3)}\tau\,dt$ for the expressions
$e\,dx_k$ in $S_{\text{int}}$ to get

$$S_{\text{int}} = \frac{1}{c}\int j_kA_k\,d^{(3)}\tau\,dt = \frac{1}{c}\int j_kA_k\,d^{(4)}\tau \tag{15.10}$$


4 Note that the Greek indices assume only the values 1, 2, 3.

The Variation Principle for the Electromagnetic Field. The principal
task of this section is to show that, like the equations of mechanics,
Maxwell's equations are equivalent to a certain variation
principle. Although electrodynamics cannot be reduced to a mechanical
system of material particles or a continuous medium based
on Newton's laws, there exists a far-reaching analogy between mechanics
and electrodynamics based on Hamilton's principle. This, it
should be noted, is not a question of a formal, superficial analogy.
The variation principle makes it possible to define such electromagnetic
field quantities as momentum and energy, which are
conserved for it, not separately but together with the respective
quantities for particles. This is the basis of conservation laws of a more
general form, which hold in closed systems comprising charged
particles and the field created by them.

In defining the action for a system of mass points, summation 
over their coordinates is carried out. To use a term from mechanics, 
an electromagnetic field is a system with an infinite number of 
degrees of freedom, because to define it fully it is necessary to state 
the values of all its components at all points of space where the 
field is not zero. But the points of space constitute an uncountable 
set, that is, they cannot be numbered in any order. Therefore, in the 
case of an electromagnetic field, summation is replaced by inte¬ 
gration over continuously varying parameters: the coordinates of the 
points defining the field. The values of the coordinates are analogous 
to the numbers listing the degrees of freedom of a mechanical system. 

The generalized coordinates of a field are given by the values
of the vector potential according to the correspondence
$q_k(t) \to \mathbf{A}(\mathbf{r}, t)$. The time derivatives of the vector potential are involved
only in the electric field expression. But the time derivatives
of the coordinates are involved in the Lagrangian of a mechanical
system only through the kinetic energy, that is, with the positive
sign. Hence, in substituting the invariant $F_{ik}F_{ik}$ into the Lagrangian
of an electromagnetic field, it must be multiplied by such
a factor that the electric field enters through a positive term.

The numerical value of this coefficient is equal to $-1/(16\pi)$, which
corresponds to the Gaussian system of units. Sometimes $-1/4$ is
taken, in which case the electromagnetic quantities are said to be
measured in Heaviside units.

Taking into account the interaction of field and charges, that is,
the term (15.10), we can now write the action for an electromagnetic
field thus:

$$S = \int\left(-\frac{F_{ik}^2}{16\pi} + \frac{j_kA_k}{c}\right)d^{(4)}\tau \tag{15.11}$$

Factoring out the time integration in the four-dimensional volume
element, we obtain the Lagrangian, which is easily rewritten in
three-dimensional notation:

$$L = \int\left(\frac{|\mathbf{E}|^2 - |\mathbf{H}|^2}{8\pi} + \frac{\mathbf{A}\cdot\mathbf{j}}{c} - \rho\varphi\right)d^{(3)}\tau \tag{15.12}$$

Taking $\partial\mathbf{A}/\partial t$ as the generalized velocity, we can derive the second
pair of Maxwell's equations from the Lagrangian in three-dimensional
form and then use the matrix (14.31) to determine their relativistic
invariance.

Instead, however, we shall perform the variation of (15.11) directly
in four-dimensional form. The distribution of currents and charges,
that is, $j_k$, is assumed to be given, so that only $A_k$ is varied. For the
variation of action we obtain
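The first step of this variation can be sketched as follows. This is a reconstruction from the action (15.11) and the definition (14.30), not a quotation of the original formula; it assumes that $d^{(4)}\tau$ stands for $d^{(3)}\tau\,dt$ and uses $\delta(F_{ik}^2) = 2F_{ik}\,\delta F_{ik}$.

```latex
\delta F_{ik} = \frac{\partial\,\delta A_k}{\partial x_i} - \frac{\partial\,\delta A_i}{\partial x_k},
\qquad
\delta S = \int\left[-\frac{1}{8\pi}\,F_{ik}\left(\frac{\partial\,\delta A_k}{\partial x_i}
 - \frac{\partial\,\delta A_i}{\partial x_k}\right) + \frac{1}{c}\,j_k\,\delta A_k\right]d^{(4)}\tau
```

The two terms in the round brackets are the "first" and "second" terms whose dummy indices are relabelled in the next step.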





Account was taken in the variation that $F_{ik} = (\partial A_k/\partial x_i) - (\partial A_i/\partial x_k)$.
In the second term we redesignate the dummy indices,
so that $i$ becomes $k$, and $k$ becomes $i$. Then this term differs from the
former only in the order of the indices in $F_{ki}$. But since $F_{ki}$ is an
antisymmetric tensor, in interchanging the indices we must change
the sign, after which the first two terms in the varied expression
combine into one. We transform the expression $F_{ik}(\partial\,\delta A_k/\partial x_i)$ by parts to
get

$$F_{ik}\frac{\partial\,\delta A_k}{\partial x_i} = \frac{\partial}{\partial x_i}\left(F_{ik}\,\delta A_k\right) - \delta A_k\frac{\partial F_{ik}}{\partial x_i}$$


We integrate the derivative $(\partial/\partial x_i)(F_{ik}\,\delta A_k)$ over the corresponding
variable in $d^{(4)}\tau$. As always in developing equations of motion,
the variation at the integration limits should be put equal to zero.
We thus finally reduce the variation $\delta S$ to the form

$$\delta S = \int\left(\frac{1}{4\pi}\frac{\partial F_{ik}}{\partial x_i} + \frac{1}{c}\,j_k\right)\delta A_k\,d^{(4)}\tau$$

For the requirement $\delta S = 0$ to be satisfied the expression in
brackets multiplied by the arbitrary variation $\delta A_k$ should vanish
at every point of the four-dimensional volume $d^{(4)}\tau$. From this we
obtain the required equations:

$$\frac{\partial F_{ki}}{\partial x_i} = \frac{4\pi}{c}\,j_k \tag{15.14}$$


Substituting the components of the four-dimensional tensor $F_{ki}$
from (14.31), we arrive at Maxwell's equations (12.32) and (12.33).

To round out the picture let us present the first pair of equations
in four-dimensional form and also give the equations for potential.
We write three equations:

$$F_{ik} = \frac{\partial A_k}{\partial x_i} - \frac{\partial A_i}{\partial x_k}, \qquad F_{kl} = \frac{\partial A_l}{\partial x_k} - \frac{\partial A_k}{\partial x_l}, \qquad F_{li} = \frac{\partial A_i}{\partial x_l} - \frac{\partial A_l}{\partial x_i}$$

differentiate the first with respect to $x_l$, the second with respect
to $x_i$, and the third with respect to $x_k$ and add them together. Then
the right-hand sides cancel out, leaving

$$\frac{\partial F_{ik}}{\partial x_l} + \frac{\partial F_{kl}}{\partial x_i} + \frac{\partial F_{li}}{\partial x_k} = 0 \tag{15.15}$$


It is again not hard to see that these four equations (according to 
the number of ways of selecting three indices out of four) are 
equivalent to (12.30) and (12.31). 

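We write the operation

$$\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial x_3^2} + \frac{\partial^2}{\partial x_4^2}$$

as $\dfrac{\partial}{\partial x_i}\dfrac{\partial}{\partial x_i} \equiv \dfrac{\partial^2}{\partial x_i^2}$, sometimes denoted by a square, $\Box$, by analogy with $\nabla^2$.

Then instead of (12.43) and (12.44) we obtain for the potential
component

$$\Box A_k \equiv \frac{\partial^2 A_k}{\partial x_i^2} = -\frac{4\pi}{c}\,j_k \tag{15.16}$$

Finally, the Lorentz condition (12.42) is written in four-dimensional
form as

$$\frac{\partial A_i}{\partial x_i} = 0 \tag{15.17}$$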


The Energy-Momentum Tensor of the Electromagnetic Field. We
shall now show how Maxwell's equations (15.14) and (15.15), together
with the equations of motion of a charged particle in an
electromagnetic field (14.33), assure satisfaction of the basic mechanical
conservation laws: energy, linear momentum, and angular momentum,
thereby finally confirming the legitimacy of treating a field
as a mechanical system.

After a certain alteration of indices multiply equation (15.14)
by $F_{li}$ to get

$$F_{li}\frac{\partial F_{ik}}{\partial x_k} = \frac{4\pi}{c}\,F_{li}\,j_i$$


In the obtained equation, redesignate i as k and k as i and add
the two entirely equivalent equations. We transform the left-hand
sides by parts, after which we obtain

$$\frac{\partial}{\partial x_k}\left(F_{li}F_{ik}\right) + \frac{\partial}{\partial x_i}\left(F_{lk}F_{ki}\right) - F_{ik}\left(\frac{\partial F_{li}}{\partial x_k} + \frac{\partial F_{kl}}{\partial x_i}\right) = \frac{8\pi}{c}\,F_{li}\,j_i$$

In combining the last two terms on the left we made use of the
fact that

$$F_{ik} = -F_{ki}, \qquad F_{lk} = -F_{kl}$$

The first two terms on the left can be combined if in the second term
we redesignate the indices in reverse of what we just did. In accordance
with the first group of Maxwell's equations (15.15), we replace
the quantity in parentheses in the third term by $-\partial F_{ik}/\partial x_l$. Then
this term becomes

$$F_{ik}\frac{\partial F_{ik}}{\partial x_l} = \frac{1}{2}\frac{\partial}{\partial x_l}F_{nm}^2 = \frac{1}{2}\,\delta_{kl}\frac{\partial}{\partial x_k}F_{nm}^2$$

Introducing now the notation

$$T_{lk} = \frac{1}{4\pi}\left(F_{li}F_{ki} - \frac{1}{4}\,\delta_{kl}F_{nm}^2\right) \tag{15.18}$$


we write the result of the transformations of the set of Maxwell's
equations in the form

$$-\frac{\partial T_{lk}}{\partial x_k} = \frac{1}{c}\,F_{li}\,j_i \tag{15.19}$$

To establish the meaning of this equation, integrate both sides
over the three-dimensional volume. First consider the right-hand
side. We go over to point charges, for which we must replace the
four-dimensional current density vector (see (15.9b)) by its expression
in terms of charge density:

$$j_k = c\rho_0\frac{dx_k}{ds}$$

and, besides, replace the charge density in its proper reference
system by its density in the stationary system:

$$\rho_0 = \rho\sqrt{1 - \frac{v^2}{c^2}} = \rho\,\frac{ds}{c\,dt}$$

Then integration over the three-dimensional volume is reduced
simply to the substitution of $\rho\,d^{(3)}x$ by the total charge e. Hence,
the integral of the right-hand side of (15.19) over the
three-dimensional volume yields

$$\frac{1}{c}\int F_{li}\,j_i\,d^{(3)}x = \frac{e}{c}\,F_{li}\frac{dx_i}{dt}$$

If we now multiply (14.33) by $ds/dt$, we find that in the right-hand
side of (15.19) we have the total time derivative of the momentum
component $p_l$ of the charge or system of charges located in the volume
over which the integration was performed.

Consider the left-hand side of Eq. (15.19) after integrating over
the volume. It comprises four terms corresponding to the
dummy index k. In the fourth term the differentiation
with respect to $x_4$ is taken outside the integral sign, because $dx_4$
is a differential of $ict$, while the integration is over $d^{(3)}x$.
As we shall soon see, $T_{\alpha 4}$ also involves an imaginary unit, so that
the fourth term of the integrated expression can be written down
as the time derivative of the real expression:

$$\frac{\partial}{\partial t}\int\frac{T_{\alpha 4}}{ic}\,d^{(3)}x$$

The remaining three terms have the form of an integral of the
divergence over the volume (cf. (11.48)); only in this case the divergence
is taken of a tensor rather than a vector. However, it is apparent
from the method used in developing the Gauss theorem that
it is applicable to the integral of any divergence. Consequently,
the first three terms of the integrated equation (15.19) turn into
a surface integral. As a result, for the first three values of l, that is,
for the spatial components, we have an equation of the form

$$-\left(\frac{\partial}{\partial t}\int\frac{T_{\alpha 4}}{ic}\,d^{(3)}x + \int T_{\alpha\beta}\,dS_\beta\right) = \frac{dp_\alpha}{dt} \tag{15.20a}$$

while for the fourth, temporal, component we have a similar equation:

$$-\left(\frac{\partial}{\partial t}\int(-T_{44})\,d^{(3)}x + \int(-icT_{4\beta})\,dS_\beta\right) = \frac{d\mathcal{E}}{dt} \tag{15.20b}$$

The equations are written so that all terms are real, that
is, do not involve the imaginary unit.

On the left-hand side we have the time rate of change of a certain quantity
integrated over the volume, added to the flux of a certain other
quantity across the surface bounding that volume. With the sign
reversed, their sum is equal to the change in the momentum or energy of the charges
in unit time. If, for example, the integration surface is so far away
that the field on it vanishes, the equations acquire a very simple
form

$$\frac{d}{dt}\left(p_\alpha + \int\frac{T_{\alpha 4}}{ic}\,d^{(3)}x\right) = 0 \tag{15.21a}$$

$$\frac{d}{dt}\left(\mathcal{E} + \int(-T_{44})\,d^{(3)}x\right) = 0 \tag{15.21b}$$

that is, each of the quantities under the time derivative sign is constant
(the partial derivative multiplying the integral can here be
replaced by a total derivative, because in the present case the
value of the integral does not depend upon the surface, provided
the field vanishes on it).

The system of charges and fields for which Eqs. (15.21a) and
(15.21b) were written is closed: since there is no field on its surface,
it interacts with nothing (for in electrodynamics there is no such
thing as action at a distance that could be realized through a surface
far away). In closed systems momentum and energy are conserved.
Hence we have arrived at the definition of the momentum and energy
of an electromagnetic field:

$$p_{\alpha\,\text{field}} = \int\frac{T_{\alpha 4}}{ic}\,d^{(3)}x \tag{15.22a}$$

$$\mathcal{E}_{\text{field}} = \int(-T_{44})\,d^{(3)}x \tag{15.22b}$$

The integrands represent the momentum and energy densities, 
respectively. 

Now let us return to the case of the field not being zero at the
surface. If we take the surface so that there are no charges within
it, Eqs. (15.20a) and (15.20b) take the form

$$\frac{\partial}{\partial t}\int\frac{T_{\alpha 4}}{ic}\,d^{(3)}x + \int T_{\alpha\beta}\,dS_\beta = 0 \tag{15.23a}$$

$$\frac{\partial}{\partial t}\int(-T_{44})\,d^{(3)}x + \int(-icT_{4\beta})\,dS_\beta = 0 \tag{15.23b}$$

The first terms in these equations denote the change in the momentum
and energy of the field in the given volume. Hence, the
second terms denote the momentum and energy fluxes across the
surface containing the volume. Equations of similar form are obtained
from Eq. (12.18) expressing the charge-conservation law. In
integral form it is represented in (12.17).

We have thus revealed the meaning of all the components of the
tensor $T_{ik}$. Let us summarize the results of the foregoing reasoning.

The components $T_{\alpha\beta}$ represent the flux, per unit time, of the field
momentum component along axis $x_\alpha$ across a unit area, the normal
to which is directed along axis $x_\beta$. But momentum transported
in unit time is force. Referred to unit area, it yields the normal or
tangential stress, depending on whether $\alpha$ and $\beta$ coincide or not.
That is why the spatial tensor $T_{\alpha\beta}$ has a special name, the Maxwell
stress tensor, which commemorates the fact that it was introduced
by Maxwell.

Components $T_{\alpha 4}/(ic)$ denote the density of the spatial momentum
component along axis $x_\alpha$.

Components $-icT_{4\alpha}$ represent the energy flux density, that is,
the energy transported by the field in unit time across a unit surface
the normal of which is directed along axis $x_\alpha$.

Component $T_{44}$ with opposite sign is the energy density of the
electromagnetic field.

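We shall now express all these components directly in terms of
the electromagnetic field components. From Eq. (15.18) we
find

$$\begin{aligned}
T_{11} &= \frac{1}{4\pi}\left(F_{1k}F_{1k} - \frac{1}{4}F_{mn}^2\right) \\
&= \frac{1}{4\pi}\left(F_{1k}^2 - \frac{1}{2}\left(|\mathbf{H}|^2 - |\mathbf{E}|^2\right)\right) \\
&= \frac{1}{4\pi}\left(H_z^2 + H_y^2 - E_x^2 - \frac{1}{2}\left(|\mathbf{H}|^2 - |\mathbf{E}|^2\right)\right) \\
&= \frac{1}{8\pi}\left(H_y^2 + H_z^2 - H_x^2 + E_y^2 + E_z^2 - E_x^2\right)
\end{aligned} \tag{15.24a}$$

Similarly, we determine the other components of $T_{\alpha\beta}$ with equal
spatial indices. Components with different spatial indices are found
as follows:

$$T_{12} = \frac{1}{4\pi}\left(F_{13}F_{23} + F_{14}F_{24}\right) = -\frac{1}{4\pi}\left(E_xE_y + H_xH_y\right) \tag{15.24b}$$

Components with 4 in the index are equal to

$$T_{14} = T_{41} = \frac{i}{4\pi}\left(E_yH_z - E_zH_y\right) = \frac{i}{4\pi}\left(\mathbf{E}\times\mathbf{H}\right)_x \tag{15.24c}$$

Hence, the density of the momentum of the field along the x axis
is equal to

$$\frac{1}{4\pi c}\left(\mathbf{E}\times\mathbf{H}\right)_x \tag{15.25}$$

while the density of the energy flux along the x axis is

$$\frac{c}{4\pi}\left(\mathbf{E}\times\mathbf{H}\right)_x \tag{15.26}$$

This vector has a special name, the Poynting vector. Finally, the
energy density is

$$-T_{44} = -\frac{1}{4\pi}\left(F_{4k}^2 - \frac{1}{2}\left(|\mathbf{H}|^2 - |\mathbf{E}|^2\right)\right) = \frac{1}{8\pi}\left(|\mathbf{E}|^2 + |\mathbf{H}|^2\right) \tag{15.24d}$$

Let us now write all the tensor components, multiplied by $4\pi$,
in the form of an array:

$$4\pi T_{ik} = \begin{pmatrix}
\tfrac{1}{2}\left(E_y^2 + E_z^2 - E_x^2 + H_y^2 + H_z^2 - H_x^2\right) & -(E_xE_y + H_xH_y) & -(E_xE_z + H_xH_z) & i(\mathbf{E}\times\mathbf{H})_x \\
-(E_xE_y + H_xH_y) & \tfrac{1}{2}\left(E_x^2 + E_z^2 - E_y^2 + H_x^2 + H_z^2 - H_y^2\right) & -(E_yE_z + H_yH_z) & i(\mathbf{E}\times\mathbf{H})_y \\
-(E_xE_z + H_xH_z) & -(E_yE_z + H_yH_z) & \tfrac{1}{2}\left(E_x^2 + E_y^2 - E_z^2 + H_x^2 + H_y^2 - H_z^2\right) & i(\mathbf{E}\times\mathbf{H})_z \\
i(\mathbf{E}\times\mathbf{H})_x & i(\mathbf{E}\times\mathbf{H})_y & i(\mathbf{E}\times\mathbf{H})_z & -\tfrac{1}{2}\left(|\mathbf{E}|^2 + |\mathbf{H}|^2\right)
\end{pmatrix} \tag{15.27}$$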

The sum of the diagonal elements, or trace, of this tensor is zero. 
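
The component formulas above can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes that the matrix (14.31) has the form $F_{12} = H_z$, $F_{14} = -iE_x$ (which follows from $x_4 = ict$, $A_4 = i\varphi$), and the function names `field_tensor`, `energy_momentum_tensor` and `cross` are chosen here only for illustration.

```python
import math

def field_tensor(E, H):
    """Antisymmetric F_ik for x4 = i*c*t (assumed form of the matrix (14.31))."""
    Ex, Ey, Ez = E
    Hx, Hy, Hz = H
    return [
        [0.0, Hz, -Hy, -1j * Ex],
        [-Hz, 0.0, Hx, -1j * Ey],
        [Hy, -Hx, 0.0, -1j * Ez],
        [1j * Ex, 1j * Ey, 1j * Ez, 0.0],
    ]

def energy_momentum_tensor(E, H):
    """T_lk = (F_li F_ki - delta_kl F_nm^2 / 4) / (4 pi), as in (15.18)."""
    F = field_tensor(E, H)
    F_squared = sum(F[n][m] ** 2 for n in range(4) for m in range(4))
    return [
        [
            (sum(F[l][i] * F[k][i] for i in range(4)) - 0.25 * (l == k) * F_squared)
            / (4.0 * math.pi)
            for k in range(4)
        ]
        for l in range(4)
    ]

def cross(a, b):
    """Ordinary three-dimensional vector product."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# Arbitrary sample field values (illustrative only).
E = (1.0, 2.0, -0.5)
H = (0.3, -1.2, 0.7)
T = energy_momentum_tensor(E, H)

trace = sum(T[i][i] for i in range(4))                # should vanish
energy_density = -T[3][3].real                        # compare with (15.24d)
momentum_density_times_c = [(T[a][3] / 1j).real for a in range(3)]  # c times (15.25)
```

For any choice of E and H the trace comes out zero to rounding error, `energy_density` agrees with $(|\mathbf{E}|^2 + |\mathbf{H}|^2)/8\pi$, and `momentum_density_times_c` agrees with $(\mathbf{E}\times\mathbf{H})/4\pi$.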

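Angular Momentum of the Electromagnetic Field. Let us write
Eqs. (15.19) for two different spatial indices, $\alpha$ and $\beta$, multiply
them by $x_\beta$ and $x_\alpha$, respectively, and subtract one from the other:

$$-\left(x_\beta\frac{\partial T_{\alpha k}}{\partial x_k} - x_\alpha\frac{\partial T_{\beta k}}{\partial x_k}\right) = \frac{1}{c}\left(x_\beta F_{\alpha k}\,j_k - x_\alpha F_{\beta k}\,j_k\right)$$

We transform the expression in parentheses on the left-hand side by parts to get

$$\frac{\partial}{\partial x_k}\left(x_\beta T_{\alpha k} - x_\alpha T_{\beta k}\right) - \delta_{k\beta}T_{\alpha k} + \delta_{k\alpha}T_{\beta k} = \frac{\partial}{\partial x_k}\left(x_\beta T_{\alpha k} - x_\alpha T_{\beta k}\right)$$

Thanks to the symmetry of tensor $T_{\alpha\beta}$, the terms outside the
derivatives cancel out.

Now integrate both sides of the equation over the
three-dimensional volume as was done in obtaining the relationships
(15.20a) and (15.20b). Transforming the right-hand side in the
same way as before, we reduce it to the form

$$\frac{e}{c}\left(x_\beta F_{\alpha k}\dot{x}_k - x_\alpha F_{\beta k}\dot{x}_k\right) = x_\beta\dot{p}_\alpha - x_\alpha\dot{p}_\beta$$

Now, taking into account that $p_\alpha = m\dot{x}_\alpha(1 - v^2/c^2)^{-1/2}$ and
$p_\beta = m\dot{x}_\beta(1 - v^2/c^2)^{-1/2}$, we represent the difference $x_\beta\dot{p}_\alpha - x_\alpha\dot{p}_\beta$ in
the form of the total derivative of $x_\beta p_\alpha - x_\alpha p_\beta$, that is, up to sign, of the
angular momentum component of a charged particle with an index
not equal to $\alpha$ or $\beta$:

$$\dot{M}_\gamma = \varepsilon_{\gamma\alpha\beta}\frac{d}{dt}\left(x_\alpha p_\beta\right)$$

But we then see that on the left we have the time derivative of
the angular momentum of the electromagnetic field, which can
easily be expressed with the help of (15.27) as

$$\mathbf{M}_{\text{field}} = \int\frac{1}{4\pi c}\left[\mathbf{r}\times(\mathbf{E}\times\mathbf{H})\right]d^{(3)}x \tag{15.28}$$

It follows from this that the integrand

$$\frac{1}{4\pi c}\left[\mathbf{r}\times(\mathbf{E}\times\mathbf{H})\right] \tag{15.29}$$

is the density of the angular momentum of the electromagnetic
field.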


EXERCISES 

1. Defining tensor $F^*_{ik} = \varepsilon_{iklm}F_{lm}$, obtain the equation for it from
(15.15). Write the components of $F^*_{ik}$ in matrix form.

2. Develop the energy conservation equation in three-dimensional form 
from (12.30) and (12.32) (the Poynting theorem). 


16 


ELECTROSTATICS OF POINT CHARGES 

Slowly Variable Fields. An important class of approximate solutions
of electrodynamical equations comprises slowly variable fields, for
which the time derivatives in Maxwell's equations can be neglected.
The remaining terms form two sets of equations, which are entirely
independent of each other:

$$\operatorname{div}\mathbf{E} = 4\pi\rho \tag{16.1}$$

$$\operatorname{curl}\mathbf{E} = 0 \tag{16.2}$$

$$\operatorname{div}\mathbf{H} = 0 \tag{16.3}$$

$$\operatorname{curl}\mathbf{H} = \frac{4\pi}{c}\,\mathbf{j} \tag{16.4}$$


The first two equations contain only the electric field and the
density of the charge producing the field; the second two equations
involve only the magnetic field and current density, the right-hand
sides of the equations being regarded as known functions of the
coordinates and time. Since there are no time derivatives in (16.2)
and (16.4), the time dependence of the electric field is the same as
that of the charge densities, and the time dependence of the magnetic field
is the same as that of the current densities. Hence, to the approximation of
(16.1)-(16.4), the field is, as it were, established instantaneously,
in correspondence with the charge and current distribution that
generated it.

The fact is that any change in the field is transmitted through
space at the speed of light c. If we consider the field at a distance R
from a charge, the electromagnetic disturbance will reach it in
a time $R/c$. The charge with a velocity v will be displaced during
that time through a distance $vR/c$. The approximation (16.1)-(16.4)
can be applied only when the displacement $vR/c$ does not lead to
any essential redistribution of the charge. For example, let a system
consist of two equal charges of opposite sign which change places
in the time $R/c$. Then, at the distance R, at the instant $t = R/c$,
the electric field will have a direction opposite to the one it would
have in the instantaneous propagation at the instant $t = 0$.

Hence, if the dimensions of the system of charges are r and their
velocities v (in order of magnitude), then Eqs. (16.1)-(16.4) can be
used at a distance from the system for which the inequality $r/v \gg R/c$,
or $R \ll rc/v$, is satisfied.

Suppose $v \ll c$. Then the region of applicability of our set of
equations will be sufficiently large.

Equations (16.1) and (16.2) are called the equations of electrostatics, 
and (16.3) and (16.4), the equations of magnetostatics. 

Scalar Potential in Electrostatics. In order to satisfy Eq. (16.2), we
put

$$\mathbf{E} = -\operatorname{grad}\varphi \tag{16.5}$$

According to (12.35), $\varphi$ is the scalar potential. The equation for
the scalar potential is obtained from (16.1):

$$\operatorname{div}\operatorname{grad}\varphi = \nabla^2\varphi = -4\pi\rho \tag{16.6}$$

which also follows from (12.44), if we equate to zero the nonstatic
term $(1/c^2)(\partial^2\varphi/\partial t^2)$.

Let us find the solution to Eq. (16.6) for a point charge, that is,
we put $\rho$ equal to zero everywhere except at the origin of the coordinate
system. Then $\varphi$ can depend only on the distance from the origin,
r.

In Section 11 an expression for the Laplacian was obtained in
spherical coordinates (11.46). In the special case, when the required
function depends only on r, we obtain from (11.46)

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\varphi}{dr}\right) = -4\pi\rho \tag{16.7}$$


Let us integrate this equation between $r_1$ and $r_2$, first multiplying
it by $r^2$. Since the region of integration does not contain the origin,
where the point charge is situated, the integral of the right-hand
side becomes zero. Hence

$$r^2\frac{d\varphi}{dr} = A = \text{constant}$$

Therefore the potential is

$$\varphi = -\frac{A}{r} + B$$

The constant B is equal to zero if we take the potential to be equal
to zero at an infinite distance away from the charge. Let us now
determine the constant A. For this, we integrate Eq. (16.6) over
a sphere surrounding the origin. Since the Laplacian $\nabla^2\varphi = \operatorname{div}\operatorname{grad}\varphi$,
the space integral can be transformed into an integral
over the surface of the sphere. This integral is

$$\int\operatorname{grad}\varphi\cdot d\mathbf{S} = \int\frac{d\varphi}{dr}\,r^2\,d\Omega = 4\pi A$$

In the right-hand side we have

$$-\int 4\pi\rho\,dV = -4\pi e$$

since the integration region includes the point where the charge
is situated, that is, the origin of the coordinate system. Thus $A = -e$.
The potential of a point charge is thus

$$\varphi = \frac{e}{r} \tag{16.8}$$

We obtain the same result for a spherically symmetrical
volume-charge distribution, if the potential is calculated outside the volume
occupied by the charges. In other words, the potential of a charged
sphere at all external points is the same as the potential of an equal
point charge situated at the centre of the sphere. A similar result
is obtained for the gravitational potential, because Newton's law of
gravitation resembles Coulomb's law in form. This fact is used in
most astronomical problems, where celestial bodies are considered
as gravitating points.

If the origin does not coincide with the charge, and the charge's
coordinates are x, y, z, that is, the charge is located at a point with
radius vector r, the potential at point X, Y, Z (radius vector R) is

$$\varphi = \frac{e}{|\mathbf{R} - \mathbf{r}|} = \frac{e}{\left[(X - x)^2 + (Y - y)^2 + (Z - z)^2\right]^{1/2}} = \frac{e}{\left[(X_\alpha - x_\alpha)(X_\alpha - x_\alpha)\right]^{1/2}} \tag{16.9}$$


The Potential of a System of Charges. Since the equations of
electrodynamics are linear, the potential produced by several charges
equals the sum of the potentials of each charge separately. If the
radius vector of the ith charge is $\mathbf{r}^i$, then the total potential of the
system is

$$\varphi = \sum_i\frac{e_i}{|\mathbf{R} - \mathbf{r}^i|} = \sum_i\frac{e_i}{\left[(X_\alpha - x^i_\alpha)(X_\alpha - x^i_\alpha)\right]^{1/2}}$$

But, to save space, in future we shall write $(X_\alpha - x^i_\alpha)^2$ instead
of $(X_\alpha - x^i_\alpha)(X_\alpha - x^i_\alpha)$. Then the potential at a point with radius
vector R is

$$\varphi = \sum_i e_i\left[(X_\alpha - x^i_\alpha)^2\right]^{-1/2} \tag{16.10}$$

remembering that inside the brackets we sum over $\alpha$ from 1 to 3.

Suppose now that the origin of a coordinate system is located
somewhere inside a domain occupied by charges, for example, the
centre of the smallest sphere embracing all the charges. We shall
look for the potential at a large distance from the origin, that is,
at a distance R for which all the inequalities

$$R \gg r^i \tag{16.11}$$

are satisfied.

In other words, we must determine the potential at a large distance
from a system of charges. For that we should expand function
(16.10) in a Taylor series in powers of $x^i_\alpha$. We shall perform the
expansion up to the quadratic term, but shall write it first for only
one term of the summation over all the charges, omitting the index i
for brevity:

$$\left[(X_\alpha - x_\alpha)^2\right]^{-1/2} = \left[X_\alpha^2\right]^{-1/2} - x_\beta\frac{\partial}{\partial X_\beta}\left[X_\alpha^2\right]^{-1/2} + \frac{1}{2}\,x_\beta x_\gamma\frac{\partial^2}{\partial X_\beta\,\partial X_\gamma}\left[X_\alpha^2\right]^{-1/2} \tag{16.12}$$


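The summation convention permits writing in concise form the
Taylor series for a function of several variables. Since $X_\alpha^2 = R^2$,
we obtain the expression for the first derivative:

$$\frac{\partial}{\partial X_\beta}\left[X_\alpha^2\right]^{-1/2} = \frac{\partial}{\partial X_\beta}\frac{1}{R} = \frac{\partial R}{\partial X_\beta}\frac{d}{dR}\frac{1}{R} = -\frac{X_\beta}{R^3} \tag{16.13}$$

where we have used Eq. (11.34), which, in the notation of this section,
is of the form

$$\frac{\partial R}{\partial X_\beta} = \frac{X_\beta}{R}$$

Thus, the term in the sum (16.12), linear in $x_\beta$, is equal to

$$\frac{x_\beta X_\beta}{R^3} = \frac{\mathbf{r}\cdot\mathbf{R}}{R^3} \tag{16.14}$$

It is somewhat more difficult to calculate the term which is quadratic
in $x_\alpha$. We first write the second derivative:

$$\frac{\partial^2}{\partial X_\beta\,\partial X_\gamma}\frac{1}{R} = -\frac{\partial}{\partial X_\gamma}\frac{X_\beta}{R^3} = -\frac{1}{R^3}\frac{\partial X_\beta}{\partial X_\gamma} - X_\beta\frac{\partial}{\partial X_\gamma}\frac{1}{R^3} = -\frac{\delta_{\beta\gamma}}{R^3} - X_\beta\frac{\partial}{\partial X_\gamma}\frac{1}{R^3}$$

in accordance with the general definition of tensor $\delta_{\beta\gamma}$ in Section 9.
Further

$$\frac{\partial}{\partial X_\gamma}\frac{1}{R^3} = \frac{\partial R}{\partial X_\gamma}\frac{d}{dR}\frac{1}{R^3} = -\frac{3}{R^4}\frac{X_\gamma}{R} = -\frac{3X_\gamma}{R^5}$$

by the general rule for differentiating a composite function, in this
case the function $1/R^3$.

Thus we obtain

$$\frac{\partial^2}{\partial X_\beta\,\partial X_\gamma}\frac{1}{R} = -\frac{\delta_{\beta\gamma}}{R^3} + \frac{3X_\beta X_\gamma}{R^5}$$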

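Finally, the expansion of $|\mathbf{R} - \mathbf{r}|^{-1}$ in powers of components of r
has the form

$$\frac{1}{|\mathbf{R} - \mathbf{r}|} = \frac{1}{R} + \frac{\mathbf{r}\cdot\mathbf{R}}{R^3} + \frac{1}{2}\,x_\beta x_\gamma\left(\frac{3X_\beta X_\gamma}{R^5} - \frac{\delta_{\beta\gamma}}{R^3}\right) \tag{16.15}$$

We now subtract from the last term in (16.15) a quantity identically
equal to zero:

$$\frac{1}{6}\,x_\alpha x_\alpha\,\delta_{\beta\gamma}\left(\frac{3X_\beta X_\gamma}{R^5} - \frac{\delta_{\beta\gamma}}{R^3}\right) = \frac{1}{6}\,x_\alpha x_\alpha\left(\frac{3X_\beta X_\beta}{R^5} - \frac{\delta_{\beta\beta}}{R^3}\right) = 0$$

since $X_\beta X_\beta = R^2$, and $\delta_{\beta\beta} = 3$. Then the expansion (16.15) takes
the form

$$\frac{1}{|\mathbf{R} - \mathbf{r}|} = \frac{1}{R} + \frac{\mathbf{r}\cdot\mathbf{R}}{R^3} + \frac{1}{2}\left(x_\beta x_\gamma - \frac{1}{3}\,\delta_{\beta\gamma}\,x_\alpha x_\alpha\right)\left(\frac{3X_\beta X_\gamma}{R^5} - \frac{\delta_{\beta\gamma}}{R^3}\right) \tag{16.16}$$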


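This series must be substituted into the potential (16.10) and
summed over all the charges. We introduce the following abbreviated
notation:

$$\mathbf{d} = \sum_i e_i\,\mathbf{r}^i \tag{16.17}$$

$$q_{\alpha\beta} = \frac{1}{2}\sum_i e_i\left(x^i_\alpha x^i_\beta - \frac{1}{3}\,\delta_{\alpha\beta}\,x^i_\gamma x^i_\gamma\right) \tag{16.18a}$$

or in terms of the components:

$$\begin{aligned}
q_{xx} &= \frac{1}{2}\sum_i e_i\left(x_i^2 - \frac{r_i^2}{3}\right), & q_{xy} &= \frac{1}{2}\sum_i e_i\,x_i y_i, \\
q_{yy} &= \frac{1}{2}\sum_i e_i\left(y_i^2 - \frac{r_i^2}{3}\right), & q_{xz} &= \frac{1}{2}\sum_i e_i\,x_i z_i, \\
q_{zz} &= \frac{1}{2}\sum_i e_i\left(z_i^2 - \frac{r_i^2}{3}\right), & q_{yz} &= \frac{1}{2}\sum_i e_i\,y_i z_i
\end{aligned} \tag{16.18b}$$

The vector d (the three quantities $d_x$, $d_y$, $d_z$) and the six quantities
$q_{xx}$, $q_{yy}$, $q_{zz}$, $q_{xy}$, $q_{xz}$, $q_{yz}$ depend only on the charge distribution
in the system, and not on the place at which the potential
is determined. In the notation of (16.17) and (16.18a) the potential
at large distances away from the system is

$$\varphi = \frac{1}{R}\sum_i e_i + \frac{\mathbf{d}\cdot\mathbf{R}}{R^3} + q_{\alpha\beta}\frac{3X_\alpha X_\beta}{R^5} \tag{16.19}$$

The term with $\delta_{\alpha\beta}/R^3$ drops out, since

$$q_{\alpha\beta}\,\delta_{\alpha\beta} = q_{\alpha\alpha} = \frac{1}{2}\sum_i e_i\left(x^i_\alpha x^i_\alpha - \frac{1}{3}\cdot 3\,x^i_\alpha x^i_\alpha\right) = 0$$

In (16.19) the terms with different indices, of the type $q_{xy}$, actually
appear twice in the summation (for example, $q_{xy}$ and the equal
term $q_{yx}$).

The vector d is called the dipole moment of a system of charges;
the tensor of rank 2, $q_{\alpha\beta}$, is the quadrupole moment of the system.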

Dipole Moment. We shall now examine the expression (16.19)
obtained for potential. The zero term $\sum e_i/R$ corresponds to the
approximation according to which the whole charge is considered
to be concentrated at the origin. In other words, it corresponds to
a substitution of the entire system of charges by a single point
charge.

This approximation is clearly insufficient when the system is neutral,
that is, if $\sum e_i = 0$. This case is very usual, since atoms and
molecules are neutral (their electronic charge balances the charge
of the nuclei).

Let us assume that the total charge is equal to zero and then consider
the first term of the expansion involving the dipole moment.
This term decreases like $R^{-2}$, that is, more rapidly than the potential
of a charged system. Besides, it is proportional to the cosine
of the angle between d and R. The simplest way to produce a neutral
system is by taking two equal and opposite charges. Such a system
is called a dipole. Its moment is

$$\mathbf{d} = \sum_i e_i\,\mathbf{r}^i = e\left(\mathbf{r}^+ - \mathbf{r}^-\right) \tag{16.20}$$

in accordance with the definition used in general courses of physics
that the dipole moment is the product of the charge by the vector
drawn from the negative to the positive charge.

It can be seen from Eq. (16.20) that the definition of dipole moment
does not depend on the choice of coordinate origin, since it
involves only the relative position of the charges. We shall show
that the dipole moment always possesses this property, provided
the total charge of the system is zero.

Indeed, if we displace the origin through some distance a, then
the radius vectors of all the charges change to

$$\mathbf{r}^i = \mathbf{r}'^i + \mathbf{a}$$

Substituting this into the expression for the dipole moment, we
obtain

$$\mathbf{d} = \sum_i e_i\,\mathbf{r}^i = \sum_i e_i\,\mathbf{r}'^i + \mathbf{a}\sum_i e_i = \sum_i e_i\,\mathbf{r}'^i = \mathbf{d}' \tag{16.21}$$

because $\sum_i e_i = 0$.

But if the system is not neutral, then we choose a in the following
manner:

$$\mathbf{a} = \left(\sum_i e_i\,\mathbf{r}^i\right)\left(\sum_i e_i\right)^{-1} \tag{16.22}$$

This choice is analogous to the choice of centre of mass for a system
of masses. Thus, we can say that in a system which is not as a whole
neutral the vector a determines the electrical centre of the system of
charges. For a neutral system it is impossible to determine a, since
the denominator of (16.22) is zero. If for a charged system we choose a
according to (16.22), then $\sum_i e_i\,\mathbf{r}'^i = 0$, that is, the dipole moment
of a charged system relative to its electric centre is equal to zero.

We thus have the following alternatives: either the system is
neutral, and then the expansion (16.19) begins with a dipole term
independent of the choice of coordinate origin, or $e = \sum_i e_i$ is a
resultant charge, and then the dipole term in the expansion is equal
to zero for a corresponding choice of origin.
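
Both alternatives are easy to illustrate numerically. The following sketch uses sample charges chosen purely for illustration; the helper names `dipole_moment` and `electric_centre` are assumptions of this example, not notation from the text.

```python
def dipole_moment(charges, origin=(0.0, 0.0, 0.0)):
    """Dipole moment (16.17) taken relative to a given origin.

    charges is a list of (e, (x, y, z)) pairs.
    """
    return tuple(sum(e * (r[a] - origin[a]) for e, r in charges) for a in range(3))

def electric_centre(charges):
    """The vector a of (16.22); undefined for a neutral system."""
    total = sum(e for e, _ in charges)
    if total == 0:
        raise ValueError("neutral system: the electric centre is undefined")
    return tuple(sum(e * r[a] for e, r in charges) / total for a in range(3))

# A neutral pair: the dipole moment does not depend on the origin (16.21).
neutral = [(1.0, (0.0, 0.0, 0.5)), (-1.0, (0.0, 0.0, -0.5))]
d_at_origin = dipole_moment(neutral)
d_shifted = dipole_moment(neutral, origin=(3.0, -2.0, 7.0))

# A charged system: the dipole moment about the electric centre vanishes.
charged = [(2.0, (1.0, 0.0, 0.0)), (-1.0, (0.0, 2.0, 0.0)), (3.0, (0.0, 0.0, -1.0))]
centre = electric_centre(charged)
d_about_centre = dipole_moment(charged, origin=centre)
```

For the neutral pair both calls give the same vector, directed from the negative to the positive charge, while for the charged system `d_about_centre` is zero to rounding error.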

Quadrupole Moment. In the expansion (16.19) we now consider
the second term containing the quadrupole moment. A quadrupole
is a system of two dipoles of moment d, which are equal in magnitude
and opposite in direction. It is clear that a potential expansion
for such a system will have neither a zero nor a first term, so that
Eq. (16.19) contains only a second term on the right-hand side.
The simplest quadrupole can be formed by placing four charges
at the vertices of a parallelogram, where the charges are of equal
magnitude but with pairs of charges having opposite signs. The
charges alternate when we traverse the vertices of the parallelogram.
Such a system is neutral. However, a charged system, too, can have
a quadrupole moment. It indicates to what extent the charge distribution
in the system differs from spherical symmetry.

Indeed, in this section it was shown that the potential due to
a spherically symmetrical system of charges decreases in strict
accordance with an $R^{-1}$ law, and the potential due to a quadrupole
follows an $R^{-3}$ law. For this reason, the quadrupole term in the potential
expansion can arise only in the case of a nonspherical charge
distribution. The subsequent expansion terms, which can be obtained
in the same way, take account of increasingly fine deviations
from spherical symmetry in the charge distribution.

Let us now determine in what sense the quadrupole moment 
characterizes a nonspherical distribution. Equation (16.22) estab¬ 
lishes the analogy between the centre of mass of a system of masses 
and the electric centre of a system of charges. Similarly, Eq. (16.18a) 
allows us to establish a certain correspondence between the compo¬ 
nents of a quadrupole moment and the moments of inertia of the 
system of masses defined by Eqs. (9.3) and (9.4). 

Since we are concerned with the similarity between, not identity 
of, quantities, we can disregard the fact that (16.18a) involves a sum¬ 
mation, while (9.3) involves an integration. Besides, this difference 
disappears if we take a continuous charge distribution or discrete 
mass distribution (as of nuclei in a molecule). Furthermore, we shall 
forget for the moment that moment-of-inertia components involve 
masses, not charges. 

The tensor expression for moment of inertia has the form

$$I_{\alpha\beta} = \sum_m m\left(x_\gamma x_\gamma\,\delta_{\alpha\beta} - x_\alpha x_\beta\right)$$

We put $\alpha = \beta$ and perform the summation according to the general
rule. We then obtain

$$I_{\alpha\alpha} = \sum_m m\left(3x_\alpha x_\alpha - x_\alpha x_\alpha\right) = 2\sum_m m\,x_\alpha x_\alpha$$

We substitute $\sum_m m\,x_\alpha x_\alpha$ into the initial expression to get

$$\sum_m m\,x_\alpha x_\beta = \frac{1}{2}\,\delta_{\alpha\beta}\,I_{\gamma\gamma} - I_{\alpha\beta}$$

Knowing how $\sum m\,x_\alpha x_\alpha$ and $\sum m\,x_\alpha x_\beta$ are expressed in terms of
$I_{\alpha\beta}$, we substitute them into the definition of a quadrupole moment:

$$q_{\alpha\beta} \sim \frac{1}{2}\left(\frac{1}{3}\,\delta_{\alpha\beta}\,I_{\gamma\gamma} - I_{\alpha\beta}\right) \tag{16.23a}$$

The symbol $\sim$ serves as a reminder that what we have is only a
correspondence.

In Section 9 it was shown that moments of inertia can be reduced
to the principal axes, that is, a coordinate system can be found
in which the products of inertia vanish, leaving only the diagonal
elements of the inertia tensor. But, since the relationships between
$q_{\alpha\beta}$ and $I_{\alpha\beta}$ exist in any coordinate system, for the same principal
axes, the quadrupole moment components with different indices
also vanish. The quadrupole moment relative to the principal axes
is made to correspond with the moment of inertia as follows:

$$q_1 \sim \frac{1}{6}\left(I_2 + I_3 - 2I_1\right) \tag{16.23b}$$

(and for other components similarly).
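
If the charges in (16.18a) are replaced by masses, the correspondence (16.23a) becomes an exact identity, which can be confirmed on sample data. The sketch below assumes the standard inertia tensor written out above; the point masses are arbitrary illustrative values and the function names are introduced only for this example.

```python
def inertia_tensor(points):
    """I_ab = sum m (r^2 delta_ab - x_a x_b) for points given as (m, (x, y, z))."""
    return [
        [sum(m * ((a == b) * sum(x * x for x in r) - r[a] * r[b]) for m, r in points)
         for b in range(3)]
        for a in range(3)
    ]

def quadrupole_like_tensor(points):
    """The expression (16.18a) with the charges e_i replaced by masses m_i."""
    return [
        [0.5 * sum(m * (r[a] * r[b] - (a == b) * sum(x * x for x in r) / 3.0)
                   for m, r in points)
         for b in range(3)]
        for a in range(3)
    ]

# Arbitrary sample masses and positions.
points = [(2.0, (1.0, 0.0, 0.5)), (1.0, (-0.5, 2.0, 0.0)), (3.0, (0.0, -1.0, 1.5))]
I = inertia_tensor(points)
q = quadrupole_like_tensor(points)

trace_I = I[0][0] + I[1][1] + I[2][2]
# Right-hand side of (16.23a): (delta_ab * I_gg / 3 - I_ab) / 2.
from_inertia = [
    [0.5 * ((a == b) * trace_I / 3.0 - I[a][b]) for b in range(3)]
    for a in range(3)
]
```

The arrays `q` and `from_inertia` coincide component by component, including the off-diagonal elements.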

If the system possesses spherical symmetry, then $I_1 = I_2 = I_3$,
so that $q_1 = q_2 = q_3 = 0$. Therefore, the presence of a quadrupole
moment in a system of charges indicates that the charge distribution
is not spherically symmetrical. However, a reverse assertion
would not be true: if the quadrupole moment is equal to zero, the
system of charges may not be spherically symmetrical. It is necessary,
in expansion (16.19), to take into account terms of higher order
than written here, and only if they are all equal to zero do we have
spherical symmetry. Only then does the potential decrease strictly
as $R^{-1}$.

It will be noted that from the definition of a quadrupole moment,
or from (16.23b), there follows directly the identity $q_{\alpha\alpha} = 0$, or
$q_1 + q_2 + q_3 = 0$, so that only two of the three principal components
of a quadrupole moment are independent.

The relations (16.23a) and (16.236) should be regarded literally 
if we are talking about a gravitational potential. We know that 
the earth is not strictly spherical, but is flattened at the poles. There¬ 
fore, the force of gravity contains terms that decrease faster than 
the inverse square law. This affects the motion of the moon and, 
even more so, that of artificial satellites, which are closer to the 
earth. For them Eq. (16.19) would have to be written up to a higher 
approximation. 

Equations (16.23b) become simpler if two moments of inertia
of the system are equal, that is, if the system's symmetry requires
the equality $I_1 = I_2$. Then

$$q_1 = q_2 \sim \frac{1}{6}\left(I_3 - I_1\right) = -\frac{q}{2}, \qquad q_3 \sim \frac{1}{3}\left(I_1 - I_3\right) = q$$

In this case the quadrupole moment has only one independent
component q. Its sign is called the sign of the quadrupole moment.

The quantity $q = \frac{1}{2}\sum_i e_i\left(z_i^2 - r_i^2/3\right)$.

If the charges were distributed with spherical symmetry, we
would have the equality $\sum e_i r_i^2 = 3\sum e_i z_i^2$, for then
$\sum e_i x_i^2 = \sum e_i y_i^2 = \sum e_i z_i^2$ for any choice of axes. Then, obviously, q,
too, would be equal to zero.

The positive sign of q shows that $\sum e_i z_i^2 > \sum e_i r_i^2/3$, that is,
it indicates a charge distribution extending along the z axis; a negative
sign indicates a flattened charge distribution.

From (16.19), the potential due to such a quadrupole with one
component q is

$$\begin{aligned}
\varphi &= -\frac{1}{2}\,q\left(\frac{3X^2}{R^5} - \frac{1}{R^3}\right) - \frac{1}{2}\,q\left(\frac{3Y^2}{R^5} - \frac{1}{R^3}\right) + q\left(\frac{3Z^2}{R^5} - \frac{1}{R^3}\right) \\
&= -\frac{3}{2}\,q\left(\frac{X^2 + Y^2 - 2Z^2}{R^5}\right) = -\frac{3}{2}\,q\left(\frac{R^2 - 3Z^2}{R^5}\right) \\
&= -\frac{3}{2}\,\frac{q}{R^3}\left(1 - 3\cos^2\vartheta\right)
\end{aligned} \tag{16.24}$$

The potential of such a quadrupole depends on the angle $\vartheta$ according
to the law $1 - 3\cos^2\vartheta$, where $\vartheta$ is the angle between the axis
of symmetry of the quadrupole and the radius vector of the point
at which the potential is determined. Such deviations from spherical
symmetry have been found in the electrostatic potential of many
nuclei. The quadrupole moments of nuclei give us an insight into
their structure.
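
As a check of (16.24), consider a simple axially symmetric arrangement that is not taken from the text: charges $+e$ at $z = \pm a$ and $-2e$ at the origin, for which both the total charge and the dipole moment vanish. The sketch below, with illustrative names and sample values, compares the exact potential with the quadrupole formula.

```python
import math

def axial_quadrupole(charges):
    """q = (1/2) sum e_i (z_i^2 - r_i^2 / 3) for charges given as (e, (x, y, z))."""
    return 0.5 * sum(e * (r[2] ** 2 - sum(x * x for x in r) / 3.0)
                     for e, r in charges)

def quadrupole_potential(q, R, theta):
    """Right-hand side of (16.24)."""
    return -1.5 * q / R ** 3 * (1.0 - 3.0 * math.cos(theta) ** 2)

def exact_potential(charges, point):
    """Direct sum of e_i / |R - r_i|."""
    return sum(e / math.sqrt(sum((p - x) ** 2 for p, x in zip(point, r)))
               for e, r in charges)

# Sample linear quadrupole: +e at z = +a and z = -a, -2e at the origin (e = 1).
a = 0.1
charges = [(1.0, (0.0, 0.0, a)), (1.0, (0.0, 0.0, -a)), (-2.0, (0.0, 0.0, 0.0))]
q = axial_quadrupole(charges)

R, theta = 50.0, 0.3
point = (R * math.sin(theta), 0.0, R * math.cos(theta))
exact = exact_potential(charges, point)
approx = quadrupole_potential(q, R, theta)
```

Here $q = 2ea^2/3$ is positive, as expected for a distribution extended along the z axis, and the two potentials agree up to a relative correction of order $(a/R)^2$.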


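The Energy of a System of Charges in an Electrostatic Field. We
shall now calculate the energy of a system of charges in an external
electric field. The potential energy of a charge in a field is equal to
$U = e\varphi$, because the force acting on the charge is equal to
$\mathbf{F} = -\operatorname{grad}U = -e\operatorname{grad}\varphi = e\mathbf{E}$. The energy of a system of charges
is thus

$$U = \sum_i e_i\,\varphi(\mathbf{r}^i) \tag{16.25}$$

where $\mathbf{r}^i$ is the radius vector for the ith charge.

Let us suppose that the field does not change much over the space
occupied by the charges, so that the potential at the site of the ith
charge can be expanded in a Taylor series:

$$\varphi(\mathbf{r}) = \varphi(0) + x_\alpha\left(\frac{\partial\varphi}{\partial x_\alpha}\right)_0 + \frac{1}{2}\,x_\alpha x_\beta\left(\frac{\partial^2\varphi}{\partial x_\alpha\,\partial x_\beta}\right)_0 + \ldots \tag{16.26}$$

We transform the last term in the same way as in the expansion
(16.15), taking advantage of the fact that $\varphi$ is the potential of the
external field (and not the field produced by the given charges),
so that $\nabla^2\varphi = 0$. We subtract from $\varphi$ the quantity $(r^2/6)\nabla^2\varphi$ equal
to zero. Then, after summation over the charges, we obtain

$$\begin{aligned}
U &= \varphi(0)\sum_i e_i - (\mathbf{d}\cdot\mathbf{E}_0) + \frac{1}{2}\sum_i e_i\left(x^i_\alpha x^i_\beta - \frac{1}{3}\,\delta_{\alpha\beta}\,x^i_\gamma x^i_\gamma\right)\left(\frac{\partial^2\varphi}{\partial x_\alpha\,\partial x_\beta}\right)_0 + \ldots \\
&= \varphi(0)\sum_i e_i - (\mathbf{d}\cdot\mathbf{E}_0) + q_{\alpha\beta}\left(\frac{\partial^2\varphi}{\partial x_\alpha\,\partial x_\beta}\right)_0 + \ldots
\end{aligned} \tag{16.27}$$

Here, the value of the field $(\operatorname{grad}\varphi)_0 = -\mathbf{E}_0$ at the origin has
been substituted into the term involving the dipole moment. Relating
Eq. (16.27) to the principal axes of the quadrupole moment,
we can rewrite it as follows:

$$U = \varphi(0)\sum_i e_i - (\mathbf{d}\cdot\mathbf{E}_0) + q_1\left(\frac{\partial^2\varphi}{\partial x_1^2}\right)_0 + q_2\left(\frac{\partial^2\varphi}{\partial x_2^2}\right)_0 + q_3\left(\frac{\partial^2\varphi}{\partial x_3^2}\right)_0 + \ldots \tag{16.28}$$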

In the case of a neutral system, the term involving dipole moment 
is especially important. The quadrupole term accounts for the exten¬ 
sion of the system, since it involves field derivatives. If the system 
is spherically symmetrical, that is, if it has a quadrupole moment 
equal to zero, there is no correction for extension. Higher order 
corrections are also absent, so that the potential energy will always 
depend only on the value of the potential at the centre. This is why 
spherical bodies not only attract, but are also attracted, as points.
Of course these assertions are mutually related by Newton’s Third
Law, which holds for electrostatics, since fields are determined by 
the instantaneous configuration of charges. 


EXERCISES 

1. Show that the mean value of the potential over a spherical surface
is equal to its value at the centre of the sphere, if the equation ∇²φ = 0
is satisfied over the whole volume. Relate this to the result obtained for
the potential energy of a spherically symmetrical system of charges in an
external field.

Hint. The potential should be expanded in a series involving the powers
of the sphere’s radius. In integration over the surface, all the terms
containing x, y, and z an odd number of times become zero. The terms
containing x, y, and z an even number of times can be rearranged so that
they are proportional to ∇²φ, ∇²∇²φ, and so on. There remains only the
zero term of the expansion, which proves the theorem.
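
The mean-value property stated in Exercise 1 can be checked numerically. The following is a minimal sketch, not part of the original text: the test potential, the position of the external point source, the sphere and the function names are all illustrative assumptions. Every term of the test potential satisfies Laplace's equation inside the sphere.

```python
import math

def phi(x, y, z):
    """Harmonic test potential; the point source lies outside the sphere."""
    src = (3.0, 0.0, 0.0)  # assumed source position (hypothetical)
    d = math.sqrt((x - src[0])**2 + (y - src[1])**2 + (z - src[2])**2)
    return 5.0 + 2.0*z + (x*x - y*y) + 1.0/d

def sphere_mean(f, centre, radius, n_theta=200, n_phi=200):
    """Average f over a spherical surface (midpoint rule in cos(theta) and phi)."""
    cx, cy, cz = centre
    total = 0.0
    for i in range(n_theta):
        u = -1.0 + (i + 0.5)*2.0/n_theta   # u = cos(theta), uniform in area
        s = math.sqrt(1.0 - u*u)
        for j in range(n_phi):
            a = (j + 0.5)*2.0*math.pi/n_phi
            total += f(cx + radius*s*math.cos(a),
                       cy + radius*s*math.sin(a),
                       cz + radius*u)
    return total/(n_theta*n_phi)

centre = (0.5, -0.2, 0.1)
mean_value = sphere_mean(phi, centre, 1.0)   # surface average
centre_value = phi(*centre)                  # value at the centre
print(mean_value, centre_value)
```

The two printed numbers should agree to within the quadrature error, which is the content of the theorem for this particular potential.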

2. Calculate the electric field of a dipole and the interaction energy 
of two dipoles located at a large distance from each other. 

3. Reduce to quadratures the problem on the motion of a charged 
particle in a dipole field in the nonrelativistic approximation. 

Hint. Use the Hamilton-Jacobi equation and separate the variables. 
The integral over the angle cannot be expressed in elementary form. 


17 


MAGNETOSTATICS OF POINT CHARGES 

The Meaning of the Equations of Magnetostatics. In the preceding 
section it was shown that if the velocities of charges are small in 
comparison with the speed of light, then the magnetic field satisfies 
the set of equations (see (16.3) and (16.4)): 

div H = 0 (17.1) 

curl H = \frac{4\pi}{c} \mathbf{j} = \frac{4\pi}{c} \rho \mathbf{v}    (17.2)

They are called the equations of magnetostatics. However, it is 
not hard to see that they cannot be directly satisfied for point 
charges. For this we take the divergence of both sides of (17.2). 
On the left we have identically div curl H = 0, while on the right 
we obtain (4π/c) div ρv. For point charges it is impossible to make
div ρv equal to zero simultaneously at all points of space, because
there are no closed current lines. Closed lines can appear only when 
the paths of the charges, that is, their motions over a specified time, 
are considered. Therefore Eq. (17.2) has no meaning for instanta¬ 
neous values and becomes meaningful only for values averaged over 
time. 

If the time averaged value is determined from (10.48), it is not 
necessary to consider the charge paths to be closed in the strict 
sense; it is sufficient for them to be finite, or even for the averaged 
quantity integrated over time to increase not faster than the time 
itself. 

Along the lines of (10.48), we define the mean of a certain function
averaged over time by

    \bar{f} = \frac{1}{t_0} \int_0^{t_0} f \, dt    (17.3)

This averaging operation is commutative with a differentiation
of the function with respect to the coordinates, since it is performed
with respect to a different variable, time.

Let us perform the averaging on the quantity div j. Taking advantage
of the commutativity of the operation div, that is, of the
determination of the partial derivatives with respect to the
coordinates, with integration over time, and using the continuity
equation div j = −∂ρ/∂t, we rewrite Eq. (17.2) in the mean form:

    \operatorname{div} \operatorname{curl} \mathbf{H} = \frac{4\pi}{c} \frac{1}{t_0} \int_0^{t_0} \operatorname{div} \mathbf{j} \, dt = -\frac{4\pi}{c} \frac{1}{t_0} \int_0^{t_0} \frac{\partial \rho}{\partial t} \, dt = -\frac{4\pi}{c} \frac{\rho(t_0) - \rho(0)}{t_0}    (17.4)


Suppose now that the difference ρ(t_0) − ρ(0) increases slower
than the time interval t_0 itself. This is valid a priori for any
quasi-periodic motion (see Sec. 10). If t_0 is chosen sufficiently large, the
ratio [ρ(t_0) − ρ(0)]/t_0 becomes infinitesimal in the limit. Thanks
to that, the mean value of the current indeed satisfies the equation

    \operatorname{div} \bar{\mathbf{j}} = -\overline{\frac{\partial \rho}{\partial t}} = 0    (17.5)


Hence Eq. (17.2) and all subsequent equations of this section 
should be understood in the sense of time averages, which will be 
denoted by bars over the quantities referring to the motion of charges. 
We shall further agree not to draw a bar over H, though it is implied. 

If the condition \overline{df/dt} = 0 is satisfied not only for the charge
density, but for all functions relating to the motion of the charges,
such motion is said to be stationary or steady.

A special case of steady motion is periodic motion, for example, 
uniform circular motion. But for a steady state it is sufficient for 
the charges simply to be moving in a limited region or to be receding 
at a rate slower than time increases. 
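
The statement that the time average of a total derivative vanishes for bounded motion can be illustrated numerically. The sketch below is an illustration under assumed inputs: the quasi-periodic function g(t) and the averaging routine are hypothetical choices, not taken from the text. It averages dg/dt over intervals of increasing length t_0 in the sense of (17.3).

```python
import math

SQRT2 = math.sqrt(2.0)

def g(t):
    """A bounded quasi-periodic function (two incommensurate frequencies)."""
    return math.cos(t) + math.cos(SQRT2*t)

def dg(t):
    """Its total time derivative."""
    return -math.sin(t) - SQRT2*math.sin(SQRT2*t)

def time_mean(f, t0, n=200000):
    """Mean of f over [0, t0], Eq. (17.3), by the midpoint rule."""
    h = t0/n
    return sum(f((k + 0.5)*h) for k in range(n))*h/t0

# The mean equals [g(t0) - g(0)]/t0, so it cannot exceed 4/t0 in magnitude.
means = {t0: time_mean(dg, t0) for t0 in (10.0, 100.0, 1000.0)}
for t0, m in means.items():
    print(t0, m, (g(t0) - g(0.0))/t0)
```

Since |g| never exceeds 2, the mean of dg/dt is bounded by 4/t_0 and tends to zero as the averaging interval grows, which is the steady-state condition used above.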


The Vector Potential Equations. In order to satisfy Eq. (17.1), 
we put, as in the most general case, 

H = curl A (17.6) 

(cf. (12.34)); here A is the vector potential. Equation (17.6) does
not fully define A, because if we add to A the gradient of an arbitrary
function f, as in (12.36), the expression for curl A will not change.
Therefore an additional condition must be imposed on A. The Lorentz
condition (12.42) suggests that we should require

    \operatorname{div} \mathbf{A} = 0    (17.7)

Then, substituting into (17.6) and (17.2), we obtain

    \operatorname{curl} \mathbf{H} = \operatorname{curl} \operatorname{curl} \mathbf{A} = \frac{4\pi}{c} \bar{\mathbf{j}}    (17.8)

But from (11.42)

    \operatorname{curl} \operatorname{curl} \mathbf{A} = \operatorname{grad} \operatorname{div} \mathbf{A} - \nabla^2 \mathbf{A} = -\nabla^2 \mathbf{A}    (17.9)

where we have made use of condition (17.7). Hence, A satisfies the
equation

    \nabla^2 \mathbf{A} = -\frac{4\pi}{c} \bar{\mathbf{j}} = -\frac{4\pi}{c} \overline{\rho \mathbf{v}}    (17.10)


which is entirely analogous to (16.6) for a point charge. 

Equation (17.10) can also be obtained from (12.43), if we discard
the term (1/c²)(∂²A/∂t²), which is inessential in magnetostatics.

Since the solution of (17.10) looks exactly like the solution of
(16.6), let us write it in a form analogous to (16.9) for a separate
point charge. Each component of A satisfies an equation of the
form (16.6), with the difference that in the right-hand side we have
the vector j (j_x, j_y, j_z). If (17.10) is expanded in components in
Cartesian coordinates, we obtain three equations of the form (16.6)
having in the right-hand sides −(4π/c) ρv_x, −(4π/c) ρv_y, and
−(4π/c) ρv_z, respectively. Hence the vector potential of a point
charge is

    \mathbf{A} = \overline{ \frac{e \mathbf{v}}{c \, |\mathbf{R} - \mathbf{r}|} }    (17.11)


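We shall now show that A satisfies condition (17.7). The divergence
must be taken with respect to the radius vector R of the point
at which A is determined.

But grad_R |R − r|⁻¹ = −grad_r |R − r|⁻¹, so that

    \operatorname{div} \mathbf{A} = \frac{e}{c} \overline{ \mathbf{v} \cdot \operatorname{grad}_R \frac{1}{|\mathbf{R} - \mathbf{r}|} } = -\frac{e}{c} \overline{ \mathbf{v} \cdot \operatorname{grad}_r \frac{1}{|\mathbf{R} - \mathbf{r}|} } = -\frac{e}{c} \overline{ \frac{d}{dt} \frac{1}{|\mathbf{R} - \mathbf{r}|} }

The expression in the right-hand side is the mean of the total time
derivative of the quantity |R − r|⁻¹. From the steady-state condition,
it is equal to zero.

We shall now calculate the mean magnetic field of a point charge.
Using Eq. (11.28), we obtain

    \mathbf{H} = \operatorname{curl} \mathbf{A} = \frac{e}{c} \overline{ \left( \operatorname{grad}_R \frac{1}{|\mathbf{R} - \mathbf{r}|} \times \mathbf{v} \right) } = \frac{e}{c} \overline{ \frac{\mathbf{v} \times (\mathbf{R} - \mathbf{r})}{|\mathbf{R} - \mathbf{r}|^3} }    (17.12)

This equation refers, of course, only to steady motion. In particular
it is applied to direct current.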


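Vector Potential at Great Distances From a System of Stationary
Currents. The vector potential for a system of point charges is equal
to the sum of the vector potentials for each charge separately:

    \mathbf{A} = \sum_i \overline{ \frac{e_i \mathbf{v}_i}{c \, |\mathbf{R} - \mathbf{r}_i|} }    (17.13)

We shall now obtain approximate formulas valid at great distances
from the system, similar to those obtained in electrostatics.
For this we substitute into (17.13) the expansion (16.15), in which
only the first term linear with respect to r_i has been retained:

    \frac{1}{|\mathbf{R} - \mathbf{r}_i|} = \frac{1}{R} + \frac{(\mathbf{r}_i \cdot \mathbf{R})}{R^3}    (17.14)

The vector potential in the approximation (17.14) is of the form

    \mathbf{A} = \frac{1}{cR} \overline{ \frac{d}{dt} \sum_i e_i \mathbf{r}_i } + \frac{1}{cR^3} \sum_i e_i \overline{ \mathbf{v}_i (\mathbf{r}_i \cdot \mathbf{R}) }    (17.15)

since \dot{r}_i = v_i. The zero term of the expansion is a total time
derivative and vanishes after averaging. We now transform the first
term of the expansion, using the identity

    \frac{d}{dt} \sum_i e_i \mathbf{r}_i (\mathbf{R} \cdot \mathbf{r}_i) = \sum_i e_i \mathbf{v}_i (\mathbf{R} \cdot \mathbf{r}_i) + \sum_i e_i \mathbf{r}_i (\mathbf{R} \cdot \mathbf{v}_i)    (17.16)

whose left-hand side, being a total time derivative, vanishes after
averaging. From this it follows that into (17.15) we can substitute half the
difference of the expressions in the right-hand side of (17.16). Then
the vector potential will be

    \mathbf{A} = \frac{1}{2cR^3} \sum_i e_i \overline{ \left[ \mathbf{v}_i (\mathbf{r}_i \cdot \mathbf{R}) - \mathbf{r}_i (\mathbf{v}_i \cdot \mathbf{R}) \right] } = -\frac{1}{2cR^3} \sum_i e_i \overline{ \mathbf{R} \times (\mathbf{r}_i \times \mathbf{v}_i) }    (17.17)

We now interchange the signs of the summation and vector product
and obtain the required equation:

    \mathbf{A} = -\frac{\mathbf{R}}{R^3} \times \sum_i \frac{e_i}{2c} \overline{ (\mathbf{r}_i \times \mathbf{v}_i) }    (17.18)

The sum appearing in (17.18) depends only on the mean current
distribution in the system:

    \boldsymbol{\mu} = \sum_i \frac{e_i}{2c} \overline{ (\mathbf{r}_i \times \mathbf{v}_i) }    (17.19)

It is called the mean magnetic moment of a system of currents. Using
the notation (17.19), we rewrite the vector potential at a large
distance from a system of currents in the following form:

    \mathbf{A} = -\frac{\mathbf{R} \times \boldsymbol{\mu}}{R^3} = \operatorname{grad} \frac{1}{R} \times \boldsymbol{\mu}    (17.20)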

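Let us now calculate the magnetic field from the vector potential.
By definition

    \mathbf{H} = \operatorname{curl} \mathbf{A} = \operatorname{curl} \left( \operatorname{grad} \frac{1}{R} \times \boldsymbol{\mu} \right)

Since μ is a constant vector, Eq. (11.30) gives

    \mathbf{H} = (\boldsymbol{\mu} \cdot \nabla) \operatorname{grad} \frac{1}{R} - \boldsymbol{\mu} \nabla^2 \frac{1}{R} = -(\boldsymbol{\mu} \cdot \nabla) \frac{\mathbf{R}}{R^3}

because ∇²R⁻¹ = 0. Further, (μ·∇) R = μ (see (11.36)). In order
to calculate (μ·∇) R⁻³, we use Eq. (11.34). This yields

    (\boldsymbol{\mu} \cdot \nabla) \frac{1}{R^3} = \left( \boldsymbol{\mu} \cdot \operatorname{grad} \frac{1}{R^3} \right) = -\frac{3}{R^4} (\boldsymbol{\mu} \cdot \operatorname{grad} R) = -\frac{3 (\boldsymbol{\mu} \cdot \mathbf{R})}{R^5}

Finally, collecting both terms, we arrive at an equation for H:

    \mathbf{H} = \frac{3 \mathbf{R} (\mathbf{R} \cdot \boldsymbol{\mu}) - R^2 \boldsymbol{\mu}}{R^5}    (17.21)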


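For comparison, we present the expression for the electric field
of a dipole:

    \mathbf{E} = -\operatorname{grad} \varphi = -\operatorname{grad} \frac{(\mathbf{R} \cdot \mathbf{d})}{R^3} = \frac{3 \mathbf{R} (\mathbf{R} \cdot \mathbf{d}) - R^2 \mathbf{d}}{R^5}    (17.22)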

Thus, both expressions for physically observable quantities, that 
is, electric and magnetic fields, are entirely analogous. The electric 
and magnetic moments determine the corresponding fields in a simi¬ 
lar way. This explains the name “magnetic moment”. 

In the case of a charge moving in a plane closed orbit, the definition
of magnetic moment (17.19) coincides with the elementary
definition of moment in terms of “magnetic sheet”. The vector product
r × v is twice the area swept out by the radius vector of the
charge in unit time (see Section 5, following Eq. (5.4)). Hence
r × v = 2 dS/dt. By definition of the mean value (17.3)

    \boldsymbol{\mu} = \frac{e}{2c} \overline{ \mathbf{r} \times \mathbf{v} } = \frac{e}{2c} \frac{1}{t_0} \int_0^{t_0} 2 \frac{d\mathbf{S}}{dt} \, dt = \frac{e \mathbf{S}}{c t_0}    (17.23)

Here, t_0 is the time of orbital revolution of the charge. In this time
the charge passes every point on the orbit once; hence the mean
current is equal to I = e/t_0. This yields the elementary definition
of a magnetic moment:

    \boldsymbol{\mu} = \frac{I \mathbf{S}}{c}    (17.24)

The similarity of Eqs. (17.21) and (17.22) is proof of the equivalence
of a closed current, that is, a “magnetic sheet”, and a fictitious
dipole with the same moment μ. At a large distance from a system
of currents the field is produced as it were by a dipole.
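
The equivalence of a closed current and a dipole at large distances can be tested numerically. The sketch below is illustrative only: the orbit radius, angular velocity, observation point and function names are assumptions introduced here. It averages the expression (17.12) over one revolution of a charge on a circular orbit and compares the result with the dipole formula (17.21), using μ from (17.19).

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

e, c = 1.0, 1.0          # Gaussian units, illustrative values
a, omega = 1.0, 0.01     # orbit radius and angular velocity (v << c)

def mean_field(R, n=2000):
    """Average of (e/c) v x (R - r)/|R - r|^3 over one revolution, Eq. (17.12)."""
    H = [0.0, 0.0, 0.0]
    for k in range(n):
        t = 2.0*math.pi*(k + 0.5)/n
        r = (a*math.cos(t), a*math.sin(t), 0.0)
        v = (-a*omega*math.sin(t), a*omega*math.cos(t), 0.0)
        d = (R[0] - r[0], R[1] - r[1], R[2] - r[2])
        w = cross(v, d)
        dist3 = dot(d, d)**1.5
        for m in range(3):
            H[m] += (e/c)*w[m]/dist3/n
    return tuple(H)

def dipole_field(R, mu):
    """H = [3R(R.mu) - R^2 mu]/R^5, Eq. (17.21)."""
    R2 = dot(R, R)
    Rmu = dot(R, mu)
    return tuple((3.0*R[m]*Rmu - R2*mu[m])/R2**2.5 for m in range(3))

mu = (0.0, 0.0, e*a*a*omega/(2.0*c))   # (e/2c) r x v for the circular orbit
R = (30.0, 10.0, 20.0)                 # observation point, |R| >> a
H_exact = mean_field(R)
H_dipole = dipole_field(R, mu)
print(H_exact, H_dipole)
```

The two fields differ only by terms of relative order (a/R)², which is the accuracy of the expansion (17.14).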

A System of Moving Point Charges in an External Magnetic Field. 
The approximation of slowly varying fields examined in this sec¬ 
tion holds only when the charge velocities are small in comparison 
with the speed of light. It is of interest to investigate the motion of 
such charges in an external magnetic field. For this we must develop 
a nonrelativistic approximation of the Lagrangian of a system of 
moving charges. 

If in a nonrelativistic approximation we substitute the kinetic
energy m|v|²/2 for −mc²(1 − |v|²/c²)^{1/2} in the exact Lagrangian
(14.23), the latter acquires the following form for a system of
identical particles (e_i = e, m_i = m):

    L = \sum_i \frac{m |\mathbf{v}_i|^2}{2} + \sum_i \left[ \frac{e}{c} (\mathbf{A}(\mathbf{r}_i) \cdot \mathbf{v}_i) - e \varphi(\mathbf{r}_i) \right]    (17.25)

Owing to the term linear with respect to the velocities, in this 
approximation, too, L is not represented as the difference between 
the kinetic and potential energies, L = T — U. But the Lagran¬ 
gian can be reduced to its usual form by means of a corresponding 
change in the reference frame. 

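We assume the external field to be uniform: H = constant. The
vector potential of such a field is conveniently represented in the
form

    \mathbf{A} = \frac{1}{2} (\mathbf{H} \times \mathbf{r})    (17.26)

since, from (11.30),

    \operatorname{curl} \frac{1}{2} (\mathbf{H} \times \mathbf{r}) = \frac{1}{2} \left[ \mathbf{H} \operatorname{div} \mathbf{r} - (\mathbf{H} \cdot \nabla) \mathbf{r} \right] = \frac{3}{2} \mathbf{H} - \frac{1}{2} \mathbf{H} = \mathbf{H}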
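
The choice (17.26) can be verified by a direct finite-difference evaluation of the curl. This is a minimal sketch with assumed values: the field components, the test point and the helper names are hypothetical.

```python
def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

H = (0.3, -1.2, 2.0)   # an arbitrary uniform field

def A(r):
    """Vector potential A = (1/2) H x r, Eq. (17.26)."""
    w = cross(H, r)
    return tuple(0.5*x for x in w)

def curl(F, r, h=1e-3):
    """Central-difference curl of the vector field F at the point r."""
    def d(i, j):
        # partial derivative of component F_i with respect to x_j
        rp = list(r)
        rm = list(r)
        rp[j] += h
        rm[j] -= h
        return (F(rp)[i] - F(rm)[i])/(2.0*h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

curl_A = curl(A, (0.4, -0.7, 1.5))
print(curl_A, H)
```

Because A is linear in r, the central differences are exact up to rounding, so the printed curl reproduces H at any test point.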

Substituting this expression into the Lagrangian and performing
a cyclic permutation in the mixed vector product, we obtain

    L = \sum_i \frac{m |\mathbf{v}_i|^2}{2} + \sum_i \left[ \frac{e}{2c} (\mathbf{r}_i \times \mathbf{v}_i) \cdot \mathbf{H} - e \varphi(\mathbf{r}_i) \right]    (17.27)

Compare this expression with the Lagrangian (8.5) for the motion
of material points with respect to a rotating reference frame.

If we assume the angular velocity of the system to be small, so
that its square can be legitimately neglected in comparison with
the terms involving it linearly, the Lagrangian (8.5) for a system
of identical particles can be rewritten as follows (after the cyclic
permutation employed in writing (17.27)):

    L = \sum_i \frac{m |\mathbf{v}_i|^2}{2} + \sum_i m (\mathbf{r}_i \times \mathbf{v}_i) \cdot \boldsymbol{\omega} - U    (17.28)

Comparing (17.27) and (17.28), we find that the motion of a system
of charges in an external constant and uniform magnetic field
coincides with its motion relative to a reference frame rotating
with a constant angular velocity

    \boldsymbol{\omega} = \frac{e \mathbf{H}}{2mc}    (17.29)

This assertion is known as Larmor's theorem.

We assume the frequency ω to be much smaller than all the frequencies
characterizing the proper motions of the charges in the
same system in the absence of an external magnetic field. This
motion remains unchanged after the magnetic field is turned on,
if we go over to a reference frame revolving with the angular velocity
−ω. The terms of the Lagrangian linear with respect to the
velocity, one due to the magnetic field and the other due to the
rotation, then cancel out. This, of course, is valid only in the linear
approximation, when the effect of the centrifugal force, which is
quadratic with respect to the angular velocity, can be neglected. This is
equivalent to the assumption of the smallness of ω in comparison
with the proper frequencies of the system.

Let a system of like charges possess a magnetic moment defined
in accordance with (17.19). Taking the charge e outside the summation
sign, we observe that in such a system the magnetic moment
is proportional to the angular momentum (see (4.20)):

    \boldsymbol{\mu} = e \sum_i \frac{1}{2c} (\mathbf{r}_i \times \mathbf{v}_i) = \frac{e}{2mc} \mathbf{M}    (17.30)

In such a system the magnetic moment and the angular momentum 
are conserved together. In an external magnetic field, the system’s 
rotation about the field is of the same type as the rotation of a free 
symmetric top about the direction of the total angular momentum. 
For that reason, the motion of a system possessing magnetic moment 
in an external magnetic field is, like the similar motion of a top, 
called precession. It takes place with a frequency defined from (17.29), 
and is accordingly known as Larmor's precession. 

Let us write the equation for this precession. Whereas in the
absence of an external magnetic field the angular momentum vector
M was constant relative to fixed axes, according to Larmor’s theorem,
in a field it is constant relative to axes rotating with the angular
velocity −eH/(2mc). The components of M along these axes are
constant in time (Ṁ = 0). To go over to fixed axes we should make
use of Eq. (9.14), assuming the fixed axes to be rotating in the opposite
sense, with an angular velocity eH/(2mc). Hence
the projections of the moment on the fixed axes are not constant,
and the equation for it is

    \frac{d\mathbf{M}}{dt} + \boldsymbol{\omega} \times \mathbf{M} = 0

Substituting ω according to Larmor’s theorem, we obtain

    \frac{d\mathbf{M}}{dt} = -\frac{e}{2mc} \mathbf{H} \times \mathbf{M} = \boldsymbol{\mu} \times \mathbf{H}    (17.31a)

or

    \frac{d\boldsymbol{\mu}}{dt} = \frac{e}{2mc} \boldsymbol{\mu} \times \mathbf{H}    (17.31b)

If in a system without a magnetic field the angular momentum
is not conserved, then (17.31b) can be referred to μ.

By first multiplying Eq. (17.31b) scalarly by μ, we find that
dμ²/dt = 0, so that μ² = constant. We then multiply scalarly by H
and find that (d/dt)(μ·H) = μ|H| (d/dt) cos ϑ = 0 (ϑ is the
angle between μ and H). Thus, the magnetic moment retains its
absolute value and rotates about the magnetic field at a constant
angle to it. It is this motion that is called precession.
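
The precession equation (17.31b) can be integrated numerically to confirm these conservation laws. The sketch below is an illustration under assumptions: the value of γ = e/(2mc), the field, the initial moment and the fourth-order Runge-Kutta step are all choices made here, not prescribed by the text.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def rhs(mu, H, gamma):
    """Right-hand side of d(mu)/dt = gamma * mu x H, with gamma = e/(2mc)."""
    w = cross(mu, H)
    return tuple(gamma*x for x in w)

def rk4_step(mu, H, gamma, dt):
    """One classical Runge-Kutta step for the precession equation."""
    k1 = rhs(mu, H, gamma)
    k2 = rhs(tuple(mu[i] + 0.5*dt*k1[i] for i in range(3)), H, gamma)
    k3 = rhs(tuple(mu[i] + 0.5*dt*k2[i] for i in range(3)), H, gamma)
    k4 = rhs(tuple(mu[i] + dt*k3[i] for i in range(3)), H, gamma)
    return tuple(mu[i] + dt*(k1[i] + 2.0*k2[i] + 2.0*k3[i] + k4[i])/6.0
                 for i in range(3))

gamma = 1.0                  # illustrative value of e/(2mc)
H = (0.0, 0.0, 2.0)          # uniform field along z
mu0 = (1.0, 0.0, 1.0)        # initial moment, inclined to the field
omega_L = gamma*math.sqrt(dot(H, H))   # Larmor frequency, Eq. (17.29)
steps = 1000
dt = 2.0*math.pi/omega_L/steps         # integrate over exactly one period

mu = mu0
for _ in range(steps):
    mu = rk4_step(mu, H, gamma, dt)
print(mu)
```

After one Larmor period the moment returns to its initial direction, while its magnitude and its projection on H stay constant throughout, in agreement with the two scalar products formed above.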

Note that a system of like charges, the resultant magnetic mo¬ 
ment of which is zero, also begins to rotate in a magnetic field. 

We shall now pass from the Lagrangian (17.27) to the Hamiltonian
of the system. From (14.25) it follows that if the energy of
a charged particle is expressed in terms of its velocity, then the
magnetic field is eliminated, leaving only the scalar potential.
But since expression (14.24), which relates the velocity and momentum,
involves a vector potential, the Hamiltonian involves
the magnetic field. Substituting the velocity expressed in terms of
the momentum instead of the kinetic energy, we obtain the Hamiltonian:

    \mathcal{H} = \frac{1}{2m} \sum_i \left( \mathbf{p}_{0i} - \frac{e}{c} \mathbf{A}(\mathbf{r}_i) \right)^2 + \sum_i e \varphi(\mathbf{r}_i)    (17.32)

Assuming the magnetic field weak, and accordingly neglecting
the square of the vector potential, we write the Hamiltonian as

    \mathcal{H} = \frac{1}{2m} \sum_i |\mathbf{p}_{0i}|^2 - \frac{e}{mc} \sum_i (\mathbf{p}_{0i} \cdot \mathbf{A}(\mathbf{r}_i)) + \sum_i e \varphi(\mathbf{r}_i)

Since the magnetic field is already involved linearly in the term
containing the product (p_{0i}·A(r_i)), p_{0i} must simply be replaced
by mv_i. After a cyclic permutation of vectors we obtain

    \mathcal{H} = \frac{1}{2m} \sum_i |\mathbf{p}_{0i}|^2 - (\mathbf{H} \cdot \boldsymbol{\mu}) + \sum_i e \varphi(\mathbf{r}_i)    (17.33)

Thus, in a magnetic field the Hamiltonian acquires a supplementary
term −(H·μ) similar to −(E·d) in an electric field.

Suppose now that the external magnetic field is not constant and 
is weakly nonhomogeneous in space, so that its variation over a 
distance of the order of the dimensions of the system of charges 
is not great. It then follows from the Hamiltonian (17.33) that 6 

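    \sum_i \dot{\mathbf{p}}_{0i} = \mathbf{F} = \frac{\partial}{\partial \mathbf{r}} (\mathbf{H} \cdot \boldsymbol{\mu}) = \operatorname{grad} (\mathbf{H} \cdot \boldsymbol{\mu})    (17.34)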

A system possessing magnetic moment is subject to a force depend¬ 
ent on the nonuniformity of the field. The equation for this force 
can be transformed in the following way. Expanding (17.34) with 
the help of (11.32), we obtain 

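    \mathbf{F} = (\boldsymbol{\mu} \cdot \nabla) \mathbf{H} + \boldsymbol{\mu} \times \operatorname{curl} \mathbf{H}

But for an external field curl H is zero, so that the force acting
on a system of charges possessing magnetic moment is

    \mathbf{F} = (\boldsymbol{\mu} \cdot \nabla) \mathbf{H}    (17.35)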

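It manifests itself in the attraction of magnetized bodies to the
poles of magnets, where the magnetic field is stronger. A similar
expression for force is obtained for a system with an electric dipole
moment.

6 Only the part of the force dependent upon the vector potential is
calculated.


EXERCISES

1. In the expansion of a vector potential, find the term due to the
magnetic quadrupole moment.

Solution. Write the factor in the second term of the expansion of
|R − r|⁻¹ due to one charge:

    (\mathbf{r} \cdot \nabla_R)^2 \frac{1}{R} = \frac{3 (\mathbf{r} \cdot \mathbf{R})^2 - r^2 R^2}{R^5}

Then transform the products by parts:

    \mathbf{v} (\mathbf{r} \cdot \mathbf{R})^2 = \frac{d}{dt} \left[ \mathbf{r} (\mathbf{r} \cdot \mathbf{R})^2 \right] - 2 \mathbf{r} (\mathbf{v} \cdot \mathbf{R}) (\mathbf{r} \cdot \mathbf{R})

and

    \mathbf{v} r^2 = \frac{d}{dt} (\mathbf{r} r^2) - 2 \mathbf{r} (\mathbf{r} \cdot \mathbf{v})

The total time derivatives, which vanish in the averaging process, are
discarded in advance. Transform the left-hand sides of the equations as
follows:

    \mathbf{v} (\mathbf{r} \cdot \mathbf{R})^2 = \frac{2}{3} \mathbf{v} (\mathbf{r} \cdot \mathbf{R})^2 - \frac{2}{3} \mathbf{r} (\mathbf{v} \cdot \mathbf{R}) (\mathbf{r} \cdot \mathbf{R}) = \frac{2}{3} \left[ (\mathbf{r} \times \mathbf{v}) \times (\mathbf{r} \cdot \mathbf{R}) \mathbf{R} \right]

    \mathbf{v} r^2 = -\frac{2}{3} \left[ \mathbf{r} \times (\mathbf{r} \times \mathbf{v}) \right]

As a result the quadrupole term of the vector potential reduces to the form

    \mathbf{A}_q = \frac{1}{3c} \sum_i e_i \overline{ (\mathbf{r}_i \times \mathbf{v}_i) \times (\mathbf{r}_i \cdot \nabla) \operatorname{grad} \frac{1}{R} }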

i 

Rewriting the obtained equation in tensor notation, we find that the 
quadrupole vector potential is defined by the following tensor: 

( ri X v %4- <?3P=° 

i 

„ (5 v xi? 2 -3X v Xx) 

{Aq) a ~ CaPvvpX ^5 


2. Study the motion of a magnetic moment μ in a magnetic field given
by the components H_z = −H_0, H_x = H_1 cos ωt, H_y = H_1 sin ωt.
Consider the cases ω = eH_0/(2mc) and ω → 0.

Solution. From (17.31b) we obtain the general precession equation:

    \frac{d\boldsymbol{\mu}}{dt} = \frac{e}{2mc} \boldsymbol{\mu} \times \mathbf{H}

By multiplying both sides of this equation by μ scalarly, we see that μ²
is conserved. It is, therefore, sufficient to write the equations only for the
components μ_x and μ_y, replacing then μ_z by (μ² − μ_x² − μ_y²)^{1/2}.

Using the abbreviated notation ω_0 = eH_0/(2mc) and ω_1 = eH_1/(2mc),
we multiply the equation for μ_y by ±i and combine with the equation for
μ_x to get

    \frac{d}{dt} (\mu_x \pm i \mu_y) = \pm i \omega_0 (\mu_x \pm i \mu_y) \pm i \omega_1 e^{\pm i \omega t} (\mu^2 - \mu_x^2 - \mu_y^2)^{1/2}

We seek the solution in the form μ_x ± iμ_y = A_± e^{±iωt}, and get the
following equation for the amplitudes A_±:

    (\omega - \omega_0) A_\pm = \omega_1 (\mu^2 - A_+ A_-)^{1/2}

Multiplying the equation with A_+ in the left-hand side by the equation for
A_−, we obtain

    A_+ A_- = \frac{\omega_1^2 \mu^2}{(\omega - \omega_0)^2 + \omega_1^2}, \qquad A_\pm = \frac{\omega_1 \mu}{[(\omega - \omega_0)^2 + \omega_1^2]^{1/2}}

When ω = ω_0 (paramagnetic resonance), the moment rotates in the
x, y-plane with frequency ω_0. When ω → 0, that is, in the case of an
infinitely slow rotation of the field, the moment strictly follows the field.


18


PLANE ELECTROMAGNETIC WAVES

Solution of the Wave Equation. All the results of electrostatics and 
magnetostatics can be obtained without the help of Maxwell’s 
equations, on the basis of the fundamental laws of Coulomb and 
Biot-Savart. The essential innovation which could not have preceded 
Maxwellian physics is the concept of electromagnetic waves. They 
propagate in vacuum, in the absence of charges, and are obtained 
as special solutions of Maxwell’s equations. These solutions not 
only led to the conclusion concerning the electromagnetic nature 
of light. They were the basis for predicting the existence of other 
electromagnetic waves, both shorter and longer than light waves, 
notably radio waves. 

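In the absence of charges or currents, Eqs. (12.43) and (12.44) for
scalar and vector potentials are written thus:

    \nabla^2 \mathbf{A} - \frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2} = 0    (18.1)

    \nabla^2 \varphi - \frac{1}{c^2} \frac{\partial^2 \varphi}{\partial t^2} = 0    (18.2)

with the additional Lorentz condition (12.42):

    \operatorname{div} \mathbf{A} + \frac{1}{c} \frac{\partial \varphi}{\partial t} = 0    (18.3)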

Equations (18.1) and (18.2) are called wave equations. Such equa¬ 
tions do not have static solutions at all. Indeed, if the second time 
derivatives of the potentials are put equal to zero, what remains is 
∇²A = 0, ∇²φ = 0. But if these equations are satisfied over all
space, their solutions can only be constant quantities. This is seen 
from Exercise 1, Section 16. 

Let a certain function f satisfy the equation ∇²f = 0 over all
space and never tend to infinity. There is no point at which it can
have a maximum. Assuming the reverse, the maximum point could
be surrounded by a small sphere on which the function assumes only
smaller values than at the centre, which contradicts the result of
the exercise mentioned above. Consequently, nowhere does function f,
or any of its derivatives, which satisfy the same equation, have
a maximum, and they are everywhere finite. Only a constant quantity
can possess such a property.

Thus, wave equations have only nonstationary solutions valid
over all space. We shall look for such partial solutions of (18.1)
and (18.2) which depend on one Cartesian coordinate, for example x,
and on time. Such solutions are, evidently, subject to the following
equations:

    \frac{\partial^2 \mathbf{A}}{\partial x^2} - \frac{1}{c^2} \frac{\partial^2 \mathbf{A}}{\partial t^2} = 0    (18.4)

    \frac{\partial^2 \varphi}{\partial x^2} - \frac{1}{c^2} \frac{\partial^2 \varphi}{\partial t^2} = 0    (18.5)

and the supplementary condition

    \frac{\partial A_x}{\partial x} + \frac{1}{c} \frac{\partial \varphi}{\partial t} = 0    (18.6)


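We shall find the most general solution of this set of equations.
We temporarily introduce the following notation:

    x + ct = \xi, \qquad x - ct = \eta    (18.7)

We transform (18.5) to these independent variables ξ and η.
Equation (18.5) can be rewritten symbolically as

    \left( \frac{\partial}{\partial x} + \frac{1}{c} \frac{\partial}{\partial t} \right) \left( \frac{\partial}{\partial x} - \frac{1}{c} \frac{\partial}{\partial t} \right) \varphi = 0

Then

    \frac{\partial \varphi}{\partial x} = \frac{\partial \varphi}{\partial \xi} \frac{\partial \xi}{\partial x} + \frac{\partial \varphi}{\partial \eta} \frac{\partial \eta}{\partial x} = \frac{\partial \varphi}{\partial \xi} + \frac{\partial \varphi}{\partial \eta}

    \frac{1}{c} \frac{\partial \varphi}{\partial t} = \frac{\partial \varphi}{\partial \xi} \frac{1}{c} \frac{\partial \xi}{\partial t} + \frac{\partial \varphi}{\partial \eta} \frac{1}{c} \frac{\partial \eta}{\partial t} = \frac{\partial \varphi}{\partial \xi} - \frac{\partial \varphi}{\partial \eta}    (18.8)

because, for constant t (dt = 0), ∂ξ/∂x = 1 and ∂η/∂x = 1, while
for constant x (dx = 0), (1/c)(∂ξ/∂t) = −(1/c)(∂η/∂t) = 1 in
accordance with the same equations. Thus symbolically

    \frac{\partial}{\partial x} + \frac{1}{c} \frac{\partial}{\partial t} = 2 \frac{\partial}{\partial \xi}, \qquad \frac{\partial}{\partial x} - \frac{1}{c} \frac{\partial}{\partial t} = 2 \frac{\partial}{\partial \eta}

so that in terms of the variables ξ and η Eqs. (18.4) and (18.5) are
written in the form

    \frac{\partial^2 \mathbf{A}}{\partial \xi \, \partial \eta} = 0, \qquad \frac{\partial^2 \varphi}{\partial \xi \, \partial \eta} = 0    (18.9)

Integrating any of them with respect to ξ we obtain

    \frac{\partial \mathbf{A}}{\partial \eta} = \mathbf{C}(\eta), \qquad \frac{\partial \varphi}{\partial \eta} = C'(\eta)    (18.10a)

It is not difficult now to integrate with respect to η:

    \mathbf{A} = \int^{\eta} \mathbf{C}(\eta) \, d\eta + \mathbf{C}_1(\xi), \qquad \varphi = \int^{\eta} C'(\eta) \, d\eta + C'_1(\xi)    (18.10b)

But the integrals of the arbitrary functions C(η) and C′(η) with
respect to η are essentially new arbitrary functions of that same
variable η, so that finally the required solutions in terms of the
variables ξ and η have the form

    \mathbf{A} = \mathbf{A}_1(\eta) + \mathbf{A}_2(\xi), \qquad \varphi = \varphi_1(\eta) + \varphi_2(\xi)    (18.11)

It is immediately apparent that substitution of such solutions into
(18.9) identically yields zeros.

Returning to the variables x and t, we can rewrite the obtained
solution as

    \mathbf{A} = \mathbf{A}_1(x - ct) + \mathbf{A}_2(x + ct)

    \varphi = \varphi_1(x - ct) + \varphi_2(x + ct)    (18.12)

It contains two arbitrary functions for each of the Eqs. (18.4)
and (18.5), so that it is a general solution.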
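
The form (18.12) can be checked by substituting it into the wave equation numerically. The following sketch is illustrative: the two profile functions A1 and A2, the value of c and the finite-difference step are arbitrary assumptions, chosen only to show that any smooth pair of profiles gives a vanishing residual.

```python
import math

c = 3.0   # propagation velocity in arbitrary units (illustrative)

def A1(s):
    """Arbitrary profile travelling in the +x direction."""
    return math.exp(-s*s)

def A2(s):
    """Arbitrary profile travelling in the -x direction."""
    return math.sin(2.0*s)/(1.0 + s*s)

def A(x, t):
    """General solution of the form (18.12)."""
    return A1(x - c*t) + A2(x + c*t)

def wave_residual(x, t, h=1e-3):
    """Finite-difference value of d2A/dx2 - (1/c^2) d2A/dt2, cf. (18.4)."""
    d2x = (A(x + h, t) - 2.0*A(x, t) + A(x - h, t))/h**2
    d2t = (A(x, t + h) - 2.0*A(x, t) + A(x, t - h))/h**2
    return d2x - d2t/c**2

residuals = [abs(wave_residual(x, t))
             for x in (-1.0, 0.3, 2.0) for t in (0.0, 0.5)]
print(max(residuals))
```

The residual is of the order of the truncation error of the difference scheme and decreases as h² when the step is reduced.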

Travelling Plane Waves. The solution depending on x — ct is not 
connected with the solution whose argument is x + ct; these are 
two linearly independent solutions. It is therefore sufficient to 
investigate one of them: 

A = A(x — ct) (18.13) 

φ = φ(x − ct) (18.14)

In order to satisfy the supplementary condition (18.6), we perform
a gauge transformation:

    \varphi(x - ct) = \varphi'(x - ct) - \frac{1}{c} \frac{\partial}{\partial t} f(x - ct) = \varphi' + \dot{f}    (18.15)

(the dot over f denotes differentiation with respect to the argument
η = x − ct). But if we put φ′ = −\dot{f}, we obtain simply φ = 0.
Then from (18.6) we also obtain A_x = 0. Thus, for a solution of the
form considered, depending on x − ct only, the Lorentz condition
is satisfied most simply by substituting φ = 0, A_x = 0.

The electric field component along x is equal to zero:

    E_x = -\frac{1}{c} \frac{\partial A_x}{\partial t} - \frac{\partial \varphi}{\partial x} = 0    (18.16)

Since the field does not depend on the gauge transformation of
the potentials, the result is a general one.

The magnetic field component along x is also equal to zero:

    H_x = \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z} = 0

We find the remaining field components:

    E_y = -\frac{1}{c} \frac{\partial A_y}{\partial t} = \dot{A}_y, \qquad E_z = -\frac{1}{c} \frac{\partial A_z}{\partial t} = \dot{A}_z    (18.17)

    H_y = -\frac{\partial A_z}{\partial x} = -\dot{A}_z, \qquad H_z = \frac{\partial A_y}{\partial x} = \dot{A}_y    (18.18)

From this it follows that E and H are perpendicular, because

    \mathbf{E} \cdot \mathbf{H} = E_y H_y + E_z H_z = 0    (18.19)

They are equal in absolute magnitude, since
|E| = |H| = (|\dot{A}_y|² + |\dot{A}_z|²)^{1/2}.
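
The relations (18.17)-(18.19) can be illustrated by computing the fields from a concrete potential. In this sketch the profiles A_y and A_z, the value of c and the evaluation point are assumptions made for illustration; the derivatives are taken by central differences, in the gauge φ = 0, A_x = 0 adopted above.

```python
import math

c = 2.0   # propagation velocity in arbitrary units (illustrative)

def Ay(x, t):
    return math.sin(x - c*t)

def Az(x, t):
    return 0.5*math.cos(2.0*(x - c*t))

def ddx(f, x, t, h=1e-5):
    return (f(x + h, t) - f(x - h, t))/(2.0*h)

def ddt(f, x, t, h=1e-5):
    return (f(x, t + h) - f(x, t - h))/(2.0*h)

def fields(x, t):
    """E = -(1/c) dA/dt and H = curl A for A = (0, Ay, Az) depending on x - ct."""
    E = (0.0, -ddt(Ay, x, t)/c, -ddt(Az, x, t)/c)
    H = (0.0, -ddx(Az, x, t), ddx(Ay, x, t))
    return E, H

E, H = fields(0.7, 0.3)
E_dot_H = sum(E[i]*H[i] for i in range(3))
abs_E = math.sqrt(sum(comp*comp for comp in E))
abs_H = math.sqrt(sum(comp*comp for comp in H))
poynting_x = E[1]*H[2] - E[2]*H[1]   # x component of E x H
print(E_dot_H, abs_E, abs_H, poynting_x)
```

The scalar product vanishes, the magnitudes coincide, and E × H points along +x, the direction in which a function of x − ct travels.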

The equality and perpendicularity of the electric and magnetic
fields are essentially invariant properties of the obtained solution.
Indeed, from Eqs. (15.3) and (15.4), |H|² − |E|² and E·H are
relativistically invariant quantities. If |E| = |H| and E·H = 0
in one reference frame, the same equations hold in any reference
frame. The solution of the form (18.13) has a simple physical meaning.

Let us take the value of E at an instant t = 0 on the plane x = 0. 
It is equal to E (0). It is clear that E (0) will have the same value 
at the instant of time t on the plane x = ct, because E (x — ct) = 
= E (0) on that plane. We can also say that the plane on which 
the field E is equal to E (0) is translated in space through a distance 
ct in the time t , that is, it moves with a velocity c. The same applies 
to any plane x = x 0 , for which there was some value of field E ( x 0 ) 
at the initial instant of time. To summarize, all planes with the 
given value of field propagate through space with the velocity c. 
Therefore, the solution E (x — ct) is called a travelling plane wave . 

We note that the form of the wave does not change as it moves; 
the distance between planes x = x_1 and x = x_2, for which E is equal
to E(x_1) and E(x_2), is constant. This result holds for any arbitrary
form of wave travelling in free space. 

Repeating, the velocity of propagation of a wave in vacuum does 
not depend on its shape or amplitude and is equal to the universal 
constant c. 

The electric and magnetic fields, as we have seen from (18.19), 
are perpendicular to the direction of wave propagation, as well as 
to each other. This is why it is said that electromagnetic waves are 
transverse (as opposed to longitudinal sound waves in air, for which 
the oscillations occur in the direction of propagation).

If we denote a unit vector in the direction of the propagation of
the wave by n and plot it along the x axis of a right-handed coordinate
system, the electric field will be directed along the y axis, and
the magnetic along the z axis. These directions correspond to the
thumb, index and middle fingers of the right hand. We will always
describe electromagnetic waves using a coordinate system in which
the electric field coincides with the y axis.

Since the vector n is directed along the x axis, we can write
x = r·n, and the electric field can be represented in the form

    E_y = \dot{A}_y (\mathbf{r} \cdot \mathbf{n} - ct)    (18.20)

But in this notation it is no longer necessary to relate vector n 
to the x axis. A solution with an argument of the same form as in 
(18.20) is applicable to any direction of n, provided only that n, E 
and H are mutually perpendicular. 

We shall now find the mechanical integrals of motion for an 
electromagnetic wave. 

According to (15.25), the momentum density of an electromagnetic
wave is

    \frac{1}{4\pi c} (\mathbf{E} \times \mathbf{H}) = \frac{|\mathbf{E}|^2}{4\pi c} \mathbf{n}    (18.21)

and the energy density, from (15.24d), is

    \frac{|\mathbf{E}|^2 + |\mathbf{H}|^2}{8\pi} = \frac{|\mathbf{E}|^2}{4\pi}    (18.22)

It differs from the momentum density by the factor c. That is precisely
the property of the energy and momentum of a particle of zero
mass, as in Eq. (14.13). This circumstance is extremely important
for the quantum theory of light (see Part III).

The density of the energy flux of an electromagnetic field is expressed
by Eq. (15.26). From it, the Poynting vector of a plane
electromagnetic wave is equal to

    \frac{c}{4\pi} \mathbf{E} \times \mathbf{H} = \frac{c}{4\pi} |\mathbf{E}|^2 \mathbf{n}    (18.23)

which agrees with the energy-density expression (18.22).

The spatial part of the energy-momentum tensor of an electromagnetic
field can be used to calculate the pressure exerted by a
plane electromagnetic wave. From the general formula (15.27) it is
apparent that, when only E_y and H_z are other than zero, there remains
one component T_xx, which is equal to

    T_{xx} = \frac{E_y^2 + H_z^2}{8\pi} = \frac{|\mathbf{E}|^2}{4\pi}    (18.24)

This component represents the momentum along the x axis crossing
a unit area perpendicular to the axis in unit time. If an incident
wave is normal to an absorbing barrier, the whole momentum is
transferred to the barrier. But according to Newton’s Second Law
(see (1.1)), the momentum transported in unit time equals the force
exerted on a unit area of a barrier normal to it. Hence, T_xx represents
the pressure exerted by an incident normal electromagnetic
wave on an absorbing barrier. We find that the pressure of a plane
wave is equal to the density of its energy.

This prediction of electrodynamics was experimentally confirmed 
by P. N. Lebedev. It was demonstrated that an electromagnetic 
field can indeed be treated as a mechanical system, as is done in 
Section 15 of this book. 


Harmonic Waves. Special interest is attached to travelling waves
for which the function E(x − ct) is harmonic. The most general
harmonic solution is of the following form:

    \mathbf{E} = \operatorname{Re} \left\{ \mathbf{F} e^{i [\omega (\mathbf{n} \cdot \mathbf{r})/c - \omega t]} \right\}    (18.25)

where the symbol Re denotes the real part of the expression inside
the brackets, F is a complex vector of the form F_1 + iF_2 (cf. (7.14c)),
and ω is the wave frequency in the same sense as in (7.3); ω is the
number of radians per second by which the argument of the exponential
changes.

The vector ωn/c is called the wave vector. It is denoted by k:

    \mathbf{k} = \frac{\omega \mathbf{n}}{c}, \qquad |\mathbf{k}| = \frac{\omega}{c}    (18.26)

The geometric meaning of k is easy to explain. We define the
wavelength, that is, the spatial distance Δr over which E reverts
to the same value. Let the required wavelength be λ. Then

    e^{i \omega \lambda / c} = e^{i \omega (\mathbf{n} \cdot \Delta \mathbf{r})/c} = e^{i |\Delta \mathbf{r}| |\mathbf{k}|} = e^{2\pi i}    (18.27)

because the period of the function e^{ix} is equal to 2π. Hence

    \lambda = \frac{2\pi c}{\omega}    (18.28)

Comparing the wavelength with the wave vector, we obtain

    \mathbf{k} = \frac{2\pi}{\lambda} \mathbf{n}, \qquad |\mathbf{k}| = \frac{2\pi}{\lambda}    (18.29)

Sometimes a quantity smaller than the wavelength by a factor
of 2π, and denoted ƛ, is used. It is equal to the inverse value of the
absolute magnitude of the wave vector.

Let us see how the frequency and wave vector of a harmonic
electromagnetic wave transform in passing from one inertial frame
of reference to another. We shall show that the transformation
properties of the components of the wave vector and the frequency are
the same as those of the coordinates and time.
We shall introduce the concept of wave phase as an argument of
the exponential function (18.25) and prove that phase is an invariant
quantity. Indeed, phase characterizes a certain event, say the
vanishing of an electric and magnetic field at some instant of time at
some point in space. If that same wave is considered in another reference
frame, the coordinates and time corresponding to that event will
have other values, but the event itself, the fact, will not, of course,
have changed.

This is easily understood by imagining an electric field being
measured according to the readings of some inertialess instrument,
that is, one that responds instantly. Two such instruments superimposed
at some instant at the same point in space but having a relative
velocity of motion must both show zero reading for the field; otherwise
the reference frame in which the electromagnetic field is zero would be
in some way preferred in comparison with the others. For example,
a light-sensitive plate exposed at a given instant would fail to darken
only in that frame of reference.

According to (18.25), the expression for the phase of a wave is

$$\psi = \mathbf{k}\cdot\mathbf{r} - \omega t = x_1 k_x + x_2 k_y + x_3 k_z + x_4\,\frac{i\omega}{c} \tag{18.30}$$

where $x_4 = ict$. For it to be invariant the wave vector and frequency
must be assumed to constitute together a single four-vector: $k_x = k_1$,
$k_y = k_2$, $k_z = k_3$, $i\omega/c = k_4$. Then the phase is written as
$\psi = k_i x_i$.

The four-vector $k_i$ possesses a peculiar property, which is apparent
from the second, scalar equation (18.26):

$$|\mathbf{k}|^2 - \frac{\omega^2}{c^2} = k_i k_i = k_i^2 = 0 \tag{18.31}$$

In other words, it is said that $k_i$ is a zero vector, that is, a vector
of zero four-dimensional length. Unlike three-vectors, which can
have zero length only when all three components vanish, four-vectors
do not possess this property.

Applying the general formulas (13.33) and (13.34) for the
transformation of the components of a four-vector, we obtain

$$k_x = \frac{k'_x + \omega' V/c^2}{(1 - V^2/c^2)^{1/2}}, \qquad k'_x = \frac{k_x - \omega V/c^2}{(1 - V^2/c^2)^{1/2}} \tag{18.32}$$

$$\omega = \frac{\omega' + V k'_x}{(1 - V^2/c^2)^{1/2}}, \qquad \omega' = \frac{\omega - V k_x}{(1 - V^2/c^2)^{1/2}} \tag{18.33}$$

As usual, the projections $k_y$ and $k_z$, which are perpendicular to the
relative velocity of the reference frames, do not change.
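The transformation can be checked numerically. The following sketch,
written under the assumption of units in which c = 1 and with
illustrative function names, boosts a light wave from the primed frame
to the unprimed one and confirms that the four-dimensional length
$|\mathbf{k}|^2 - \omega^2/c^2$ remains zero.

```python
import math


def boost_wave(omega_p, kx_p, ky_p, kz_p, V, c=1.0):
    """Primed-frame (omega', k') -> unprimed (omega, k); first formulas of (18.32), (18.33)."""
    gamma = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    kx = gamma * (kx_p + omega_p * V / c**2)
    omega = gamma * (omega_p + V * kx_p)
    # Components perpendicular to the relative velocity do not change.
    return omega, kx, ky_p, kz_p


def four_length(omega, kx, ky, kz, c=1.0):
    """|k|^2 - omega^2/c^2, which must vanish for a light wave."""
    return kx**2 + ky**2 + kz**2 - (omega / c) ** 2


# Light wave in the primed frame at 60 degrees to the x axis, |k'| = omega'/c.
omega_p = 2.0
theta_p = math.radians(60.0)
k_p = (omega_p * math.cos(theta_p), omega_p * math.sin(theta_p), 0.0)

omega, kx, ky, kz = boost_wave(omega_p, *k_p, V=0.6)
print(omega, four_length(omega, kx, ky, kz))
```

The chosen numbers (V = 0.6c, an angle of 60 degrees) are arbitrary; any
other light-like wave vector should give a vanishing four-length as well.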

Suppose now that a light source is at rest relative to the primed
reference frame, that is, it is moving with velocity V along the x
axis of the unprimed, “stationary”, system. A stationary observer
measures the frequency of a light beam travelling at an angle $\theta$
to the x axis. Then, from (18.33) and (18.26), and substituting
$k_x = (\omega/c)\cos\theta$, we find the change in the frequency of the
moving light source as compared with its frequency in its proper
reference frame:

$$\omega' = \frac{\omega\,[1 - (V/c)\cos\theta]}{(1 - V^2/c^2)^{1/2}} \tag{18.34}$$

If $\theta < \pi/2$, then, to terms linear in $V/c$, $\omega > \omega'$.

This formula describes the well-known Doppler effect, which is
used to measure the radial velocities (that is, velocities directed
strictly along the line of sight) of celestial bodies. To the accuracy
of the term linear in $V/c$, the effect is obtained in nonrelativistic
theory from elementary kinematic considerations.

The significance of the relativistic formula is especially apparent
when the source is moving perpendicular to the line of sight. Then,
in the second equation in (18.33) we must put $k_x = 0$, obtaining thus

$$\omega = \omega' (1 - V^2/c^2)^{1/2} \tag{18.35}$$

This formula expresses the transverse Doppler effect. It is directly
associated with the relativistic time dilation described by Eq. (13.21a).
A moving emitter of harmonic waves can be treated as a clock,
insofar as a periodic process with a frequency $\omega'$ takes place in it.
The total number of oscillations is an invariant quantity: every
oscillation is an event. Since a moving clock shows less time to have
passed, the oscillation frequency $\omega'$ in its own frame must be
correspondingly greater than the observed frequency $\omega$.

The transverse Doppler effect has been observed spectroscopically 
with atoms in motion (Ives-Stilwell experiment, 1938). For this 
experiment the ratio V/c was sufficient to observe the frequency 
shift. This offered direct experimental proof of time dilation in 
relative motion. 
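The two limiting cases of the Doppler formula can be sketched
numerically as follows. The function names and the value V/c = 0.6 are
illustrative assumptions; the ratio itself is the one given by (18.34).

```python
import math


def proper_over_observed(beta: float, theta: float) -> float:
    """Ratio omega'/omega from Eq. (18.34); beta = V/c, theta in radians."""
    return (1.0 - beta * math.cos(theta)) / math.sqrt(1.0 - beta**2)


def observed_frequency(omega_proper: float, beta: float, theta: float) -> float:
    """Frequency measured by the stationary observer."""
    return omega_proper / proper_over_observed(beta, theta)


# Beam along the source velocity (theta = 0): the frequency is raised.
head_on = observed_frequency(1.0, 0.6, 0.0)
# Beam perpendicular to the source velocity: transverse effect, Eq. (18.35).
transverse = observed_frequency(1.0, 0.6, math.pi / 2)
print(head_on, transverse)
```

For the transverse case the result reduces to the factor
$(1 - V^2/c^2)^{1/2}$, which has no nonrelativistic counterpart.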

Polarization of a Plane Harmonic Wave. Let us now study the
nature of the oscillations of an electric field in a plane harmonic
(otherwise called “monochromatic”) wave. For this, we write the
vector $\mathbf{F}$ (see (18.25)) in the form

$$\mathbf{F} = \mathbf{F}_1 + i\mathbf{F}_2 = (\mathbf{E}_1 - i\mathbf{E}_2)\, e^{i\alpha} \tag{18.36}$$

We choose the phase $\alpha$ so that the vectors $\mathbf{E}_1$ and
$\mathbf{E}_2$ are mutually perpendicular. We multiply Eq. (18.36) by
$e^{-i\alpha}$ and square. Then we obtain

$$(\mathbf{E}_1 - i\mathbf{E}_2)^2 = |\mathbf{E}_1|^2 - |\mathbf{E}_2|^2 = e^{-2i\alpha}\left[|\mathbf{F}_1|^2 - |\mathbf{F}_2|^2 + 2i(\mathbf{F}_1\cdot\mathbf{F}_2)\right] \tag{18.37}$$

We have taken advantage of the fact that $\mathbf{E}_1$ and $\mathbf{E}_2$
are perpendicular. Because of this $(\mathbf{E}_1 - i\mathbf{E}_2)^2$ is
a purely real quantity. Therefore, the imaginary part of (18.37) must be
put equal to zero. Representing $e^{-2i\alpha}$ as
$\cos 2\alpha - i\sin 2\alpha$, we obtain

$$-(|\mathbf{F}_1|^2 - |\mathbf{F}_2|^2)\sin 2\alpha + 2(\mathbf{F}_1\cdot\mathbf{F}_2)\cos 2\alpha = 0$$

or

$$\tan 2\alpha = \frac{2(\mathbf{F}_1\cdot\mathbf{F}_2)}{|\mathbf{F}_1|^2 - |\mathbf{F}_2|^2} \tag{18.38}$$

whence the angle $\alpha$ is determined for the given solution (18.25).

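It is now easy to express $\mathbf{E}_1$ and $\mathbf{E}_2$. By (18.36),
$\mathbf{E}_1 - i\mathbf{E}_2 = (\mathbf{F}_1 + i\mathbf{F}_2)\, e^{-i\alpha}
= \mathbf{F}_1\cos\alpha + \mathbf{F}_2\sin\alpha - i(\mathbf{F}_1\sin\alpha - \mathbf{F}_2\cos\alpha)$,
so that

$$\mathbf{E}_1 = \mathbf{F}_1\cos\alpha + \mathbf{F}_2\sin\alpha, \qquad \mathbf{E}_2 = \mathbf{F}_1\sin\alpha - \mathbf{F}_2\cos\alpha \tag{18.39}$$

We now include the constant phase $\alpha$ in the exponent of (18.25)
and, for short, put

$$\alpha - \omega\left(t - \frac{\mathbf{r}\cdot\mathbf{n}}{c}\right) = \psi \tag{18.40}$$

Then, in the most general case, the electric field for a plane
harmonic wave will be

$$\mathbf{E} = \mathrm{Re}\left[(\mathbf{E}_1 - i\mathbf{E}_2)\, e^{i\psi}\right] = \mathbf{E}_1\cos\psi + \mathbf{E}_2\sin\psi \tag{18.41}$$

Here, the vectors $\mathbf{E}_1$ and $\mathbf{E}_2$ are defined as perpendicular.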

Let us now represent the solution of (18.41) graphically. Let the
wave propagate along the x axis. The y axis is directed along
$\mathbf{E}_1$, and the z axis along $\mathbf{E}_2$. Hence, from (18.41)
we obtain

$$E_y = |\mathbf{E}_1|\cos\psi, \qquad E_z = |\mathbf{E}_2|\sin\psi \tag{18.42}$$

We eliminate the phase $\psi$. For this divide the first equation by
$|\mathbf{E}_1|$, the second by $|\mathbf{E}_2|$, square and add. Then the
phase is eliminated and an equation relating the field components remains:

$$\frac{E_y^2}{|\mathbf{E}_1|^2} + \frac{E_z^2}{|\mathbf{E}_2|^2} = 1 \tag{18.43}$$


It follows that the electric field vector describes an ellipse in the 
y,z-plane, which is itself moving along the x axis with velocity c, 
and passes around the whole ellipse over one wavelength. Relative 
to a fixed coordinate system, the electric field vector describes a helix 
wound on an elliptic cylinder. The pitch of the helix is equal to the 
wavelength. 

Such an electromagnetic wave is termed elliptically polarized. 
It represents the most general form of a plane harmonic wave (18.25). 
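The reduction to principal axes described by (18.38) and (18.39) can be
sketched in a few lines. The function name and the sample vectors
$\mathbf{F}_1$, $\mathbf{F}_2$ are illustrative assumptions; the use of
`atan2` is merely a convenient way of solving (18.38) that also covers
the case $|\mathbf{F}_1| = |\mathbf{F}_2|$.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def principal_axes(F1, F2):
    """Return (alpha, E1, E2) with E1 perpendicular to E2, per (18.38)-(18.39)."""
    # tan(2*alpha) = 2 F1.F2 / (|F1|^2 - |F2|^2)
    alpha = 0.5 * math.atan2(2.0 * dot(F1, F2), dot(F1, F1) - dot(F2, F2))
    c, s = math.cos(alpha), math.sin(alpha)
    E1 = tuple(f1 * c + f2 * s for f1, f2 in zip(F1, F2))
    E2 = tuple(f1 * s - f2 * c for f1, f2 in zip(F1, F2))
    return alpha, E1, E2


# Illustrative vectors in the y,z-plane (wave travelling along x).
F1 = (0.0, 1.0, 0.0)
F2 = (0.0, 0.5, 0.5)
alpha, E1, E2 = principal_axes(F1, F2)
print(alpha, E1, E2, dot(E1, E2))
```

The lengths of $\mathbf{E}_1$ and $\mathbf{E}_2$ found in this way are
the semi-axes of the ellipse (18.43).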

If one of the components is equal to zero, for example $\mathbf{E}_1 = 0$
or $\mathbf{E}_2 = 0$, then the oscillations of $\mathbf{E}$ occur in one
plane. Such a wave is termed plane polarized.
When $|\mathbf{E}_1|$ is equal to $|\mathbf{E}_2|$, the vector $\mathbf{E}$
describes a circle in the y,z-plane. Depending on the sign of
$\mathbf{E}_2$, the rotation around the circle occurs in a clockwise or
counterclockwise direction. Accordingly, the wave is termed right-hand or
left-hand polarized. Figure 25 shows the configuration of vectors
$\mathbf{E}_1$, $\mathbf{E}_2$, and $\mathbf{E}$ for a right-hand
polarized wave and a left-hand polarized wave. For the same
value of phase $\psi$, the rotation is either clockwise or counterclockwise.



Figure 25 

The sum of two circularly polarized waves of equal amplitude
gives a plane polarized wave. The relationship between their phases
determines the plane of polarization. Thus, if the waves shown in
Figure 25 are added, the oscillations $\mathbf{E}_2$ and $-\mathbf{E}_2$
mutually cancel out, and only the plane polarized oscillation
$\mathbf{E}$ remains. In turn, a circularly polarized oscillation is
resolved into two mutually perpendicular plane oscillations.

In nature it is most common to observe unpolarized (natural)
light. Naturally, such light cannot be strictly monochromatic (that
is, possessing strictly one frequency $\omega$), for, as we have just shown,
monochromatic light is always polarized in some way. But if we
imagine that the components $\mathbf{E}_1$ and $\mathbf{E}_2$ in Figure 25
are not related by a strict phase relationship (18.42) but randomly change
their relative phases, then the resultant vector will also change its
direction in the y,z-plane in a random manner. For this, it is necessary
that the oscillation frequencies should vary within some finite
interval $\Delta\omega$, since the difference of phase between two
oscillations of strictly constant and identical frequency is constant.

Natural light scattered on charges may be polarized for certain 
directions of the scattered beam. 
EXERCISES


1. Show that the superposition of two waves of equal amplitude,
circularly polarized in opposite directions and travelling in the same
direction, the wavelength difference between them being $\Delta\lambda$,
produces a plane polarized wave whose polarization vector rotates as the
wave propagates.

2. Find the relationship between angles $\theta$ and $\theta'$ defining the
inclination of a light beam to the direction of the relative velocity of
two reference frames.

Solution. Substituting $k_x = (\omega/c)\cos\theta$,
$k'_x = (\omega'/c)\cos\theta'$ into the first equations of (18.32), and
taking into account that $k_y = (\omega/c)\sin\theta$,
$k'_y = (\omega'/c)\sin\theta'$, $k_y = k'_y$, we find

$$\tan\theta = \frac{\sin\theta'\,(1 - V^2/c^2)^{1/2}}{\cos\theta' + V/c}$$

whence

$$\cos\theta = \frac{\cos\theta' + V/c}{1 + (V/c)\cos\theta'}$$

which agrees with the velocity addition formula (13.23).

In applying the obtained relationship to the phenomenon of light
aberration, we must put $\theta' = \pi/2$ for the rays of a star passing
perpendicular to the plane of the earth’s orbit. Then in a reference frame
fixed relative to the earth we obtain a constant angle of inclination of
the ray to the perpendicular to the plane of the orbit:

$$\sin\left(\frac{\pi}{2} - \theta\right) = \cos\theta = \frac{V}{c}, \qquad \frac{\pi}{2} - \theta \approx \frac{V}{c}$$

In the earth’s annual motion the respective star describes a circle of
angular radius $V/c$. Taking into account that the star is at a finite
distance from the sun, its parallax is additionally superimposed on this
angular displacement.
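To get a feeling for the size of the aberration angle, the relation
between $\theta$ and $\theta'$ can be evaluated numerically. The sketch
below assumes an orbital speed of the earth of about 29.8 km/s (a rounded
value not given in the text); the function name is likewise illustrative.

```python
import math

C = 2.99792458e8  # speed of light, m/s


def observed_angle(theta_prime: float, beta: float) -> float:
    """Angle theta in the unprimed frame for a ray at theta' in the primed frame."""
    cos_theta = (math.cos(theta_prime) + beta) / (1.0 + beta * math.cos(theta_prime))
    return math.acos(cos_theta)


# Assumed orbital speed of the earth, roughly 29.8 km/s.
beta = 29.8e3 / C
tilt = math.pi / 2 - observed_angle(math.pi / 2, beta)  # radians
tilt_arcsec = math.degrees(tilt) * 3600.0
print(f"aberration angle = {tilt_arcsec:.2f} arcsec")
```

The angle comes out at about twenty seconds of arc, very nearly equal
to $V/c$ expressed in angular measure.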

3. Prove that the quantity $\omega^2\, d\Omega$ is relativistically invariant.

Hint. Make use of Eq. (18.34) and the result of Exercise 2. 
TRANSMISSION OF SIGNALS.
ALMOST PLANE WAVES


The Impossibility of Transmitting a Signal by Means of a Monochromatic
Wave. A plane monochromatic wave (18.25) extends without
limit in all directions of space and in time. Nowhere, so to speak,
does it have a beginning or an end. What is more, its properties are
everywhere always the same: its frequency, amplitude and the
distance between two travelling crests (that is, the wavelength $\lambda$)
are always constant. All this can be easily seen by considering
a sinusoid or helix.

Let us now pose the problem of the possibility of transmitting 
an electromagnetic signal over a distance. In order to transmit the 
signal, an electromagnetic disturbance must be concentrated in 
a certain volume. By propagation, this disturbance can reach another 
region of space; detected by some means (for example, a radio receiver),
it will transmit to the point of reception a signal about an
event occurring at the point of transmission. Likewise, our visual 
perceptions are a continuous recording of electromagnetic (light) 
disturbances originating in surrounding objects. A signal must 
somehow be bounded in time in order to give notice of the beginning 
and end of any event. 

In order to transmit a signal the amplitude of the wave must, 
for a time, be somehow changed. For example, the amplitude of 
one of the waves of the sinusoid must be increased and we must 
wait until this increased amplitude arrives at the receiving device. 
A strictly monochromatic wave, that is, a sinusoid, has the same 
amplitude everywhere and is therefore not suitable for transmitting 
signals. In the same way, an ideal plane wave with a given wave 
vector cannot transmit the image of an object limited in space. 

Propagation of a Nonmonochromatic Wave. Let us now see how
the superposition of several sinusoids, that is, monochromatic waves,
can be used to transmit signals. Suppose that we have at our disposal
a frequency spectrum of travelling waves lying within the interval
$\omega_0 - \Delta\omega/2 \le \omega \le \omega_0 + \Delta\omega/2$, such
that the total width of the spectrum, that is, the frequency interval
$\Delta\omega$, is considerably smaller than the carrier frequency
$\omega_0$. For the sake of simplicity, the amplitudes of all the waves
will be assumed to be identical and equal to $E_0(\omega) = E_0$ within
the chosen spectrum, vanishing outside that interval.

Then the resulting oscillation will be represented by the integral
over all the separate monochromatic oscillations:

$$E = \int E_0(\omega)\, e^{-i(\omega t - kx)}\, d\omega = E_0 \int_{\omega_0 - \Delta\omega/2}^{\omega_0 + \Delta\omega/2} e^{-i(\omega t - kx)}\, d\omega \tag{19.1}$$

In this equation not only the frequency is variable, but also the
absolute value of the wave vector $k$, the so-called wave number.
According to (18.26) it is equal to $\omega/c$, but here it is more
convenient to consider a dependence of more general form: $k = k(\omega)$.

Since the frequency lies within a small interval, $k$ can be expanded
in a power series in $(\omega - \omega_0)$:

$$k(\omega) = k(\omega_0) + (\omega - \omega_0)\left(\frac{dk}{d\omega}\right)_0 + \dots \tag{19.2}$$


Substituting into (19.1), we obtain the following expression for
the field:

$$E = E_0\, e^{-i(\omega_0 t - k_0 x)} \int_{\omega_0 - \Delta\omega/2}^{\omega_0 + \Delta\omega/2} e^{-i(\omega - \omega_0)[t - (dk/d\omega)_0 x]}\, d\omega \tag{19.3}$$

We now introduce a new integration variable $\xi = \omega - \omega_0$.
Evaluating the integral, we reduce it to the form

$$E = E_0\, e^{-i(\omega_0 t - k_0 x)} \int_{-\Delta\omega/2}^{\Delta\omega/2} e^{-i\xi[t - (dk/d\omega)_0 x]}\, d\xi = E_0\, e^{-i(\omega_0 t - k_0 x)}\; \frac{2\sin\{[t - (dk/d\omega)_0 x](\Delta\omega/2)\}}{t - (dk/d\omega)_0 x} \tag{19.4}$$


Let us now examine the expression obtained. It consists of two
factors. The first of them, $E_0\, e^{-i(\omega_0 t - k_0 x)}$, represents
a travelling wave homogeneous in space with a mean carrier frequency
$\omega_0$. However, the amplitude of the resultant wave is no longer
constant in space because of the second factor

$$\frac{2\sin\{[t - (dk/d\omega)_0 x](\Delta\omega/2)\}}{t - (dk/d\omega)_0 x} = \Delta\omega\,\frac{\sin\chi}{\chi} \equiv g(\chi)$$

where the designations $g$ and $\chi$ are obvious from the equation. Thanks
to the sine, this factor has an infinite number of maxima. One of
them, however, is the greatest and is attained when the argument $\chi$
of the function $g$ vanishes. The other maxima are smaller and decrease
with distance from the principal one, which is located at
$x = (d\omega/dk)_0\, t$. It can thus be observed that this maximum has no
constant position in space and itself moves with a velocity

$$v = \frac{d\omega}{dk} \tag{19.5}$$

because it follows from the definition of the maximum point $\chi = 0$
that

$$x = \frac{d\omega}{dk}\, t = vt$$

As mentioned at the beginning of this section, the displacement
of the maximum can be applied for transmitting signals from one
spatial point to another, because this maximum is distinguished
from the other maxima. Such a spatially concentrated disturbance
is called a wave packet.
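The closed form (19.4) can be compared with a direct numerical
evaluation of the integral (19.1). The sketch below assumes the vacuum
dependence $k = \omega/c$ (for which the expansion (19.2) is exact),
units with c = 1, $E_0 = 1$, and arbitrary illustrative values of
$\omega_0$ and $\Delta\omega$; the function names are hypothetical.

```python
import cmath
import math


def packet_numeric(t, x, omega0=50.0, d_omega=2.0, c=1.0, n=2000):
    """Midpoint-rule evaluation of the integral (19.1) with E0 = 1, k = omega/c."""
    h = d_omega / n
    total = 0j
    for j in range(n):
        w = omega0 - d_omega / 2 + (j + 0.5) * h
        total += cmath.exp(-1j * (w * t - (w / c) * x)) * h
    return total


def packet_closed(t, x, omega0=50.0, d_omega=2.0, c=1.0):
    """Closed form (19.4) with (dk/d omega)_0 = 1/c and k0 = omega0/c."""
    tau = t - x / c
    carrier = cmath.exp(-1j * omega0 * tau)
    # The limit tau -> 0 of 2 sin(tau * d_omega / 2) / tau is d_omega.
    envelope = d_omega if tau == 0 else 2.0 * math.sin(tau * d_omega / 2) / tau
    return carrier * envelope


difference = abs(packet_numeric(3.0, 1.7) - packet_closed(3.0, 1.7))
peak = abs(packet_closed(2.0, 2.0))  # principal maximum lies at x = c t
print(difference, peak)
```

The principal maximum of the modulus is equal to $\Delta\omega$ and is
found at x = ct, in agreement with (19.5) for vacuum.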

A wave packet need not necessarily have the form shown in Figure 26,
where it refers to the expression (19.4). By selecting a relationship
$E_0(\omega)$ other than in Eq. (19.1), that is, one involving not
a constant amplitude in the frequency interval $\Delta\omega$ but a more
complex function of frequency, the shape of $E(x)$ can be changed.
In particular, it is simple to make the resultant amplitude rectangular,
so that the transmitted signal resembles a Morse-code dash.



The word “simple” here refers to the analytical definition of the
quantity $E_0(\omega)$ yielding a rectangular signal. Indeed, Eq. (19.1)
is in fact an integral Fourier transformation from function $E_0(\omega)$
of variable $\omega$ to function $g(\chi)$ of variable $\chi$. But the
Fourier transformation possesses the property of reciprocity: if function
$E_0(\omega)$ is given the form corresponding to $g(\chi)$ in Eq. (19.4),
the output signal $E(x)$ will be rectangular.

If the carrier frequency $\omega_0$ is high enough, it can be used to
transmit separate audio-frequency signals without their superimposing
on one another. In other words, it is possible to reproduce music
or speech without appreciable distortions.

Frequency Range and Signal Duration. As we have seen, a certain
frequency range is always required to transmit a signal. A monochromatic
wave with a strictly definite frequency is uniform in time:
it, as it were, transmits one signal of infinite duration. From the
Fourier inversion theorem, which expresses the property of reciprocity
of the integrals, we conclude that a signal of infinitesimal duration
requires an infinitely large frequency range for transmission.
But what if the duration of the signal is finite? What frequency range
is needed to transmit it?

This can be concluded from an examination of Figure 26, which
presents the dependence of $g$ on $\chi$. For the purpose of transmitting
a signal, only the domain of the curve close to the principal maximum
at $\chi = 0$ is important. In units of $\chi$ it is of the magnitude of
$\pi$. Hence, the duration of the signal is determined from the equation

$$\Delta\chi = \frac{\Delta\omega}{2}\,\Delta t \sim \pi$$

In other words, to transmit a signal of duration $\Delta t$ the required
frequency interval is $\Delta\omega$, connected with $\Delta t$ by the
relationship

$$\Delta\omega\,\Delta t \sim 2\pi \tag{19.6a}$$

It should be noted that this estimate refers only to the order
of magnitude of $\Delta\omega$ and $\Delta t$. Determination of
$\Delta\chi$ is to a certain extent arbitrary and done purely visually
from the shape of the curve.

The shape of the curve suggests that the main contribution to the
transmission of the signal is made by small values of $\chi$, close to
zero, because around zero the amplitude of the wave is greater. From
an evaluation of the frequency interval according to the relative
contribution of the various frequencies to the transmitted signal,
it is possible to develop a somewhat more strict evaluation than
(19.6a), namely

$$\langle\Delta\omega\rangle\,\langle\Delta t\rangle \sim 1 \tag{19.6b}$$

Here the quantities are in angular brackets to stress that the method
of evaluation is different than in (19.6a). It must, furthermore, be
pointed out that (19.6a) and (19.6b) are lower estimates. In many
cases the strong inequality $\Delta\omega\,\Delta t \gg 2\pi$ holds.
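How arbitrary the definition of the width is can be seen from a small
numerical experiment. The sketch below takes, as one possible convention
(an assumption, not the one used in the text), the full width of the
principal maximum of $\sin\chi/\chi$ at half its height and converts it
into the product $\Delta\omega\,\Delta t$ by means of
$\chi = (\Delta\omega/2)\,t$.

```python
import math


def g(chi: float) -> float:
    """Shape of the envelope, sin(chi)/chi, normalized to 1 at chi = 0."""
    return 1.0 if chi == 0 else math.sin(chi) / chi


def half_max_chi(lo: float = 1e-9, hi: float = math.pi, tol: float = 1e-12) -> float:
    """Bisection for g(chi) = 1/2; g falls monotonically from 1 to 0 on (0, pi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


chi_half = half_max_chi()
# Full width in chi is 2*chi_half; since chi = (d_omega/2) * t,
# the product d_omega * dt equals 4*chi_half.
product = 4.0 * chi_half
print(chi_half, product, product / (2 * math.pi))
```

With this convention the product is somewhat larger than $2\pi$, which
shows that (19.6a) fixes only the order of magnitude.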

If a radio station is required to transmit sounds audible to the
human ear, then the quantity $\Delta t$ must not be greater than
$0.5 \times 10^{-4}$ s, since the limit of audibility is
$2 \times 10^4$ oscillations per second. Actually, frequencies not higher
than $0.5 \times 10^4$ are adequate for transmission.

The frequency range $\Delta\omega$ is always less than the carrier
frequency $\omega_0$, which, even for the longest-wave transmitting
stations, is not less than $10^6$. The frequency $\omega_0$ must be
compared with an interval $\Delta\omega$ of the order of
$0.5 \times 10^4$, since cutting off the highest frequencies
in music, singing or speech does not introduce any essential distortion.

Television transmissions require a considerably greater frequency
interval, because an image must be reproduced 25 times every second
and, in turn, consists of tens of thousands of separate points. As
a result, the carrier frequency must be very high, corresponding
to the metre waveband. Such waves propagate within a relatively
small radius; they are screened by the curvature of the earth’s
surface, just as light is. It is interesting that the transmission of
colour images requires practically the same frequency range as
black-and-white images. This is mainly due to the fact that man’s colour
vision is not as sharp as his contour vision.

Phase and Group Velocity. Let us consider in greater detail the
speed with which signals are transmitted. If we apply formula
(19.5) to the propagation of a signal in vacuum, we obviously obtain
$v = c$. The situation is different in a nonabsorbing material medium,
where the dependence of $k$ on $\omega$ should be assumed different than
in vacuum (the situation is greatly complicated in an absorbing
medium). The phenomenon is known as dispersion of electromagnetic
waves.

We shall not investigate it here and simply accept that between $k$
and $\omega$ there is a dependence which is not a direct proportionality.
This is extremely important for the optico-mechanical analogy with
which Section 21 deals (see Part III). Thus, we take the velocity
of a wave packet as being

$$v = \frac{d\omega}{dk}$$

It differs from the propagation velocity of the constant phase
surface, which is expressed in terms of frequency and wave number as

$$u = \frac{\omega}{k} \tag{19.7}$$


Indeed, the expression for a travelling monochromatic wave can
be written in the following form:

$$E = E_0\, e^{ik(x - \omega t/k)}$$

Comparing this formula with the general expression for a travelling
wave $E = E(x - ut)$, we arrive at (19.7). The velocity of the wave
is designated $u$ rather than $c$, because (19.7) refers to the propagation
of the wave not in vacuum, where the difference between $u$ and $c$
vanishes, but in a nonabsorbing medium; $u$ is the phase velocity of
the wave and $v$ the group velocity of the wave packet resulting from
the superposition of a group of waves.

The group velocity can also be defined in vector form:

$$\mathbf{v} = \frac{\partial\omega}{\partial\mathbf{k}} \tag{19.8}$$

thereby defining the direction of the signal.
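The distinction between the two velocities may be illustrated on a
concrete dispersion law. The law used below,
$\omega = (\omega_p^2 + c^2k^2)^{1/2}$, is purely an assumed example (it
is not derived or used in this section), chosen because for it the
product $uv$ equals $c^2$ exactly, which makes the numerical derivative
easy to check.

```python
import math

C = 1.0        # speed of light in the chosen units
OMEGA_P = 1.0  # hypothetical parameter of the illustrative dispersion law


def omega(k: float) -> float:
    """Assumed dispersion law omega(k) = sqrt(omega_p^2 + c^2 k^2)."""
    return math.sqrt(OMEGA_P**2 + (C * k) ** 2)


def phase_velocity(k: float) -> float:
    """u = omega / k, Eq. (19.7)."""
    return omega(k) / k


def group_velocity(k: float, dk: float = 1e-6) -> float:
    """v = d omega / dk, Eq. (19.5), by a central difference."""
    return (omega(k + dk) - omega(k - dk)) / (2.0 * dk)


k = 2.0
u, v = phase_velocity(k), group_velocity(k)
print(u, v, u * v)
```

In this example the phase velocity exceeds c while the group velocity,
with which the signal is transmitted, remains below it.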

The Form of a Wave in Space and the Range of the Wave Vectors.
An expression similar to (19.6a) can also be obtained for the form
of a wave in space at a definite instant of time. For this we must
take $\chi$ at some constant time $t$ = constant and then, once again
taking $\Delta\chi \sim \pi$, we obtain

$$\Delta\chi = \frac{\Delta\omega}{2}\,\frac{dk}{d\omega}\,\Delta x = \frac{\Delta k}{2}\,\Delta x \sim \pi$$

or

$$\Delta k\,\Delta x \sim 2\pi \tag{19.9}$$

For $\langle\Delta k\rangle$ and $\langle\Delta x\rangle$ we can write
a formula similar to (19.6b).

In view of the fact that $\mathbf{k}$ is a vector quantity, relationship
(19.9) should be written for all three components of $\mathbf{k}$. Then
instead of (19.9) we arrive at three estimates:

$$\Delta k_x\,\Delta x \sim 2\pi, \qquad \Delta k_y\,\Delta y \sim 2\pi, \qquad \Delta k_z\,\Delta z \sim 2\pi \tag{19.10a}$$


We shall explain the relations (19.10a) by means of a graphic 
example. Let us suppose that an electromagnetic wave has, in some 
way, to be bounded on the sides, as in the case of a radar beam. 
Let us find the greatest accuracy with which a radar can register 
the position of an object at a distance l. Obviously, this accuracy 
is given by the diameter of the beam d at the distance l from the 
radar. 

Let the frequency at which the radar operates be $\omega$, the
corresponding wavelength then being $\lambda = 2\pi c/\omega$. If the
electromagnetic wave were propagated in unbounded space, it would have
(or could have) an accurately defined wave vector

$$\mathbf{k} = \frac{2\pi}{\lambda}\,\mathbf{n}$$

($\mathbf{n}$ is the unit vector in the direction of the beam). If the
wave has a cross section $d$, then $\mathbf{k}$ can no longer be regarded
as an accurately defined vector along $\mathbf{n}$.

In order to write an expression for the electromagnetic wave at 
any point in space occupied by the beam, it is necessary to take 
a group of plane waves whose vectors k lie inside a cone described 
by a certain aperture angle. We shall assume that the axis of the 
cone coincides with n. We have in mind here not a cone with 
a sharply defined surface but a conic bundle of directions. The dependence
of the wave’s amplitude on its direction of propagation
inside the cone can be, for example, like the curve in Figure 26 close 
to the principal maximum. 

The aperture angle of a conic beam is defined by a certain value
of $k_\perp$ in any direction perpendicular to the axis of the cone;
$k_\perp$ defines the interval of values, $\Delta k$, necessary for the
diameter of the bundle of beams in space to be $d$. It is apparent that
in meaning $k_\perp$ and $d$ represent $\Delta k_y/2$ and $\Delta y$, or
$\Delta k_z/2$ and $\Delta z$ in the second and third formulas in
(19.10a), if $\mathbf{n}$ is directed along the x axis. We thus
arrive at the relationship

$$2k_\perp d \ge 2\pi \tag{19.10b}$$

($\Delta k_y$ should be put equal to $2k_\perp$, because the divergence
from $\mathbf{n}$ is in both directions).

The dimensions of the radar antenna itself can be ignored if the
diameter of the beam is considered at a great distance; and this is
of practical interest. In other words, $d$ is determined only by the
relationship (19.10a) and is independent of the dimensions of the
antenna.

The divergence of the beam at every point is measured by the
ratio $k_\perp/k$. For this reason, the ratio of the cross section of the
beam $d$ to the distance from the radar $l$ cannot be less than the
quantity $2k_\perp/k$:

$$\frac{d}{l} \ge \frac{2k_\perp}{k}$$

This relationship is shown in Figure 27 for the limiting case of
the equality, but it should be remembered that what we have here
is, rather, the upper estimate of the order of magnitude of
$2k_\perp/k$. Equation (19.10b) is the lower estimate of this quantity:

$$\frac{2k_\perp}{k} \ge \frac{2\pi}{kd}$$

Knowing the upper and lower estimates for $k_\perp$, we can eliminate
$k_\perp/k$ from them, obtaining

$$\frac{d}{l} \ge \frac{2\pi}{dk}$$

or

$$d^2 \ge \frac{2\pi l}{k} = \lambda l$$

and finally

$$d \ge (\lambda l)^{1/2}$$
For example, if $l$ = 100 km and $\lambda$ = 1 m, then the position of
the object cannot be determined with an accuracy exceeding 320 m.
This is why the dimensions of the antenna could be neglected.
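The arithmetic of this example is easily reproduced. The short sketch
below simply evaluates the estimate $d \ge (\lambda l)^{1/2}$; the
function name is an illustrative choice.

```python
import math


def min_beam_diameter(wavelength_m: float, distance_m: float) -> float:
    """Lower estimate of the beam diameter, d >= sqrt(lambda * l), in metres."""
    return math.sqrt(wavelength_m * distance_m)


# Values from the example in the text: lambda = 1 m, l = 100 km.
d_min = min_beam_diameter(1.0, 100e3)
print(f"d >= {d_min:.0f} m")
```

The value obtained, a little over 316 m, is quoted in the text in
rounded form.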


Limits of Applicability of the Ray Concept. Equations (19.10a)
indicate within what limits the concept of a ray is applicable in
optics. Obviously, one can talk about a ray in a definite direction
only when

$$\Delta k \ll k \tag{19.11}$$

that is, when the transverse broadening of the wave vector is
considerably less than the wave vector itself. In the radar problem this
means that $k_\perp \ll k$. But $k_\perp \sim \pi/d$ and
$k = 2\pi/\lambda$, so that (19.11) is equivalent to the condition

$$d \gg \lambda \tag{19.12}$$


In other words, the dimensions of the region in which the concept 
of a light ray is meaningful must be considerably larger than the 
wavelength of the light wave. For example, a small aperture in the 
wall of a camera-obscura of diameter, say, 1 mm is considerably 
greater than the wavelength of visible light, which is of the order
of $0.5 \times 10^{-4}$ cm. Therefore, the image obtained in a camera-obscura
is formed with the aid of light rays. 

The optics of light rays is called geometrical optics. A ray is 
defined only when its direction is given, that is, the normal to the 
wave front. If we are given a beam of nonparallel (for example, 
converging) rays, then the wave front is curved. But the radius of 
its curvature at each point must be much greater than the wavelength 
for it to be represented close to that point with the help of an
osculating plane front. Then a convergent beam of rays is represented
as the aggregate of normals to the corresponding plane wave fronts. 

Close to the focus of the optical system, where all the rays
converge, the radius of curvature of the wave front may become comparable
with the wavelength, and then deviations from geometrical optics occur.
They are known as wave diffraction. They are also observed when
a wave falls on an opaque obstacle. In accordance with geometrical
optics, there should be a sharp shadow, that is, a transition from
a domain where the field is not zero to the domain where it is zero. But
Maxwell’s equations involve derivatives of fields with respect to
coordinates and do not permit discontinuous solutions in free space.
Actually there is always a transition zone between “light” and “shade”
in which the wave amplitude changes in a complex, nonmonotonic
way.
EXERCISES


1. Write the relationship between phase and group velocity.

Answer.

$$v = \frac{d\omega}{dk} = \frac{d(uk)}{dk} = u + k\,\frac{du}{dk} = u - \lambda\,\frac{du}{d\lambda}$$


2. Show that if the dependence on frequency of the amplitude of
a monochromatic wave in a wave packet is proportional to a function of
the form

$$\exp\left[-(\omega - \omega_0)^2/(\Delta\omega)^2\right]$$

where $\omega_0$ is the carrier frequency, and $\Delta\omega \ll \omega_0$
characterizes the spectrum width, then the form of the wave packet
reproduces an analogous function, but of $t - x/v$, a function such that
the packet width is inversely proportional to $\Delta\omega$. Compare this
with (19.6a).

3. Find the limiting size of an object that can be seen under
a microscope, using light of wavelength $\lambda$.

Solution. Denoting one-half the aperture angle of the cone of the rays
from the lens to the object as $\theta$, we have
$\Delta k = k\sin\theta$. From this

$$\Delta x \ge \frac{2\pi}{\Delta k} = \frac{2\pi}{k\sin\theta} = \frac{\lambda}{\sin\theta}$$

Thus, it is best to use beams with large aperture angles and short
wavelengths.
THE EMISSION

OF ELECTROMAGNETIC WAVES


Basic Equations and Boundary Conditions. So far we have considered
electromagnetic waves irrespective of the charges producing them.
In this section we shall consider the emission of waves by point
charges moving in a vacuum. The basic system of equations in
this case is (12.43) and (12.44) together with the Lorentz condition
(12.42). We rewrite these equations anew:

$$\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = -\frac{4\pi}{c}\,\mathbf{j} \tag{20.1}$$

$$\nabla^2\varphi - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = -4\pi\rho \tag{20.2}$$

$$\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\varphi}{\partial t} = 0 \tag{20.3}$$
Equations (20.1) and (20.2), written for the Cartesian components
of the vector potential and for the scalar potential, are of the same
form (the Laplacian of a vector differs from the Laplacian of a scalar
in curvilinear components, for example $A_r$, $A_\theta$, $A_\varphi$).
Therefore we seek the solution of one of these equations and apply it to
all the others.

It is convenient to proceed as follows: find the field of one point 
emitter and then sum over all emitting charges, just as in Section 16 
we first obtained the static field of one point charge. By virtue of 
the linearity of Eqs. (20.1)-(20.3), the solution for an arbitrary 
distribution of charges and currents is the sum or integral taken 
over all the emitters. 

Suppose that a charge $\delta e$, equal to $\rho\, dV$, is placed in
a volume element $dV$ at the origin of a coordinate system. Let us look
for the corresponding potential.

In order to determine the solution uniquely we must impose 
a certain boundary condition on it. Let us assume that the charges 
are located in infinite space free of matter, that is, that there are 
neither conductors nor dielectrics in the vicinity. Then the boundary 
condition can be imposed only at an infinite distance from the 
charges. In accordance with the posed problem, it is natural to 
assume that there was no radiation field at an infinitely long time 
before the emission began and at an infinite distance from the emitter:

$$\varphi(t \to -\infty,\ r \to \infty) = 0, \qquad \mathbf{A}(t \to -\infty,\ r \to \infty) = 0 \tag{20.4}$$


This allows for a unique solution of the problem.

Since charge $\delta e$ is placed at the origin of the coordinate system,
it is natural to seek a solution possessing spherical symmetry. Then
in the equation for $\varphi$ we must go over to spherical coordinates and
differentiate only with respect to $r$, as in the electrostatic problem,
and also, of course, with respect to time.

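We now write Eq. (20.2), leaving only the derivatives with respect
to $r$ and $t$. We assume that it refers to any point except the origin,
in which case in the right-hand side we have zero:

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\varphi}{\partial r}\right) - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = 0 \tag{20.5}$$

Temporarily, we put

$$\varphi = \frac{\Phi(r, t)}{r} \tag{20.6}$$

Then

$$\frac{\partial\varphi}{\partial r} = \frac{1}{r}\frac{\partial\Phi}{\partial r} - \frac{\Phi}{r^2}, \qquad \frac{\partial}{\partial r}\left(r^2\frac{\partial\varphi}{\partial r}\right) = r\frac{\partial^2\Phi}{\partial r^2} + \frac{\partial\Phi}{\partial r} - \frac{\partial\Phi}{\partial r} = r\frac{\partial^2\Phi}{\partial r^2}$$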
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Substituting this into (20.5) and multiplying by r (by definition r 
is not equal to zero), we obtain 


d 2 Q _1 d 2 Q 

dr 2 c 2 dt 2 


(20.7) 


But this is the equation, of the form (18.5), for the propagation 
of a wave. Its solution is similar to (18.12): 

$$\Phi = \Phi_1\left(t + \frac{r}{c}\right) + \Phi_2\left(t - \frac{r}{c}\right) \tag{20.8}$$

We now apply the boundary conditions (20.4). The solution Φ₁ 
depends on the argument t + r/c, and the solution Φ₂ depends on 
the argument t − r/c. The first of these arguments, t + r/c, for 
r → ∞ and t → −∞ has a completely indeterminate form ∞ − ∞, 
that is, it is equal to anything. From (20.4) the function Φ becomes 
zero as r → ∞ and t → −∞ (r in the denominator of (20.6) is 
immaterial, as will become apparent from the subsequent discourse). 
Therefore Φ₁ vanishes for any value of the argument, that is, 
identically. For the function Φ₂ conditions (20.4) denote that 
Φ₂(−∞) = 0. In other words, Φ₂ tends to zero at minus infinity. It does 
not follow from this, of course, that it is equal to zero everywhere. Thus 

$$\Phi = \Phi_2\left(t - \frac{r}{c}\right)$$

Omitting the subscript 2, we write the expression for φ as follows: 

$$\varphi = \frac{1}{r}\,\Phi\left(t - \frac{r}{c}\right) \tag{20.9}$$

The function Φ is not yet determined. From the form of its 
argument we conclude that it describes a wave travelling in the direction 
of increasing values of the radius (because t > 0). Such a wave is 
termed diverging. 
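
The diverging-wave form (20.9) can be checked numerically. The sketch below is only an illustration under stated assumptions: the Gaussian profile `big_phi`, the unit wave speed `C`, and the step `h` are arbitrary choices that do not come from the text. It evaluates the left-hand side of Eq. (20.5) by central finite differences and should give a value close to zero away from the origin.

```python
import math

C = 1.0  # wave speed in arbitrary units (assumption made for this check)


def big_phi(u):
    """Hypothetical smooth profile Phi(u); any twice-differentiable function would do."""
    return math.exp(-u * u)


def phi(r, t):
    """Diverging spherical wave of Eq. (20.9): phi = Phi(t - r/c) / r."""
    return big_phi(t - r / C) / r


def wave_residual(r, t, h=1e-3):
    """Left-hand side of Eq. (20.5), approximated by central finite differences."""
    # (1/r^2) d/dr (r^2 dphi/dr): the inner derivative is taken at r + h/2 and r - h/2
    flux_plus = (r + h / 2) ** 2 * (phi(r + h, t) - phi(r, t)) / h
    flux_minus = (r - h / 2) ** 2 * (phi(r, t) - phi(r - h, t)) / h
    radial_part = (flux_plus - flux_minus) / (h * r * r)
    # second derivative with respect to time
    time_part = (phi(r, t + h) - 2.0 * phi(r, t) + phi(r, t - h)) / (h * h)
    return radial_part - time_part / (C * C)


if __name__ == "__main__":
    for r, t in [(1.0, 0.5), (2.0, 2.3), (5.0, 4.0)]:
        print(f"r = {r}, t = {t}, residual = {wave_residual(r, t):.2e}")
```

The residual is limited only by the finite-difference error, which shrinks as the step `h` is reduced (until rounding takes over).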


Retarded Potential. The value of the function Φ at r = 0, t = 0 
is shifted to the point r in a time t = r/c. In other words, the 
potential at point r and time t is determined by the charge, not at 
time t, but at an earlier time t − r/c. The term r/c is a measure of 
the retardation occurring as a result of the finite velocity of 
propagation of the wave. Of course, the change in the charge δe located 
at the origin of the coordinate system is due to the fact that some 
charges come to this point while others leave it; the charges 
themselves do not change. But for the time being our solution takes into 
account only the potential of those charges that are located at the 
origin. 

Very close to the origin, where the retardation becomes a very 
small quantity, the potential must be determined by the 
instantaneous value of the charge δe(t) (see beginning of Section 16). 
As was shown in Section 16, the potential of a point charge is 
equal to δe/r (see (6.8)), whence 

$$\varphi(t) = \frac{\delta e(t)}{r} = \frac{\rho(t)\,dV}{r}$$

Therefore 

$$\Phi(t) = \rho(t)\,dV \tag{20.10}$$

The retarded potential of a point charge, φ(t − r/c), is, in 
accordance with (20.9) and (20.10), 

$$\varphi\left(t - \frac{r}{c}\right) = \frac{\rho(t - r/c)\,dV}{r} \tag{20.11}$$


Now, shifting the coordinate origin to a different point, we obtain 
an expression similar to (16.9): 

$$\varphi = \frac{\rho\left(t - |\mathbf{R} - \mathbf{r}|/c,\ \mathbf{r}\right)dV}{|\mathbf{R} - \mathbf{r}|} \tag{20.12}$$

Here the charge density is determined at point r (x, y, z), while 
the potential is calculated at point R (X, Y, Z). Thus, the 
dependence of the charge density on the spatial coordinates is involved 
in this expression dually: directly, in terms of the argument r, as 
in the static problem, and in terms of the temporal argument 
t − |R − r|/c, owing to the fact that the retardation of the 
potential at point R due to different charges of the system is different. 
Finally, in order to obtain the complete solution of Eq. (20.2), we 
must integrate (20.12) over all volume elements, that is, over 
dV = dx dy dz: 

$$\varphi = \int\frac{\rho\left(t - |\mathbf{R} - \mathbf{r}|/c,\ \mathbf{r}\right)}{|\mathbf{R} - \mathbf{r}|}\,dV \tag{20.13}$$

For point charges ρ denotes the special function that was defined 
in Section 12. It is equal to zero at every point except the one in 
which the charge is located at the given instant. 

Equation (20.1) has exactly the same form as (20.2), and its 
solution satisfies the same boundary conditions. Therefore the vector 
potential is written analogously to (20.13): 

$$\mathbf{A} = \frac{1}{c}\int\frac{\mathbf{j}\left(t - |\mathbf{R} - \mathbf{r}|/c,\ \mathbf{r}\right)}{|\mathbf{R} - \mathbf{r}|}\,dV \tag{20.14}$$


This solution can be compared with (17.10), obtained without taking 
account of the retardation. 

Let us now verify that the solution of the set of equations (20.1) 
and (20.2) satisfies the Lorentz condition (20.3). The time derivative 
of the scalar potential is taken directly, because t is involved in the 
integral over V as a parameter: 

$$\frac{\partial\varphi}{\partial t} = \int\frac{\partial\rho}{\partial t}\,\frac{dV}{|\mathbf{R} - \mathbf{r}|}$$

The divergence of A must be taken with respect to the argument R, 
appearing in the integrand as a parameter, hence it can be taken 
under the integral sign; R is involved only in the combination 
|R − r|, so that ∇_R is replaced by −∇_r. But div_R cannot be 
directly replaced by −div_r, which refers to the whole expression, 
because r is also the spatial argument of j. The following 
substitution must be made: 

$$\operatorname{div}_{\mathbf{R}}\frac{\mathbf{j}}{|\mathbf{R} - \mathbf{r}|} = -\operatorname{div}_{\mathbf{r}}\frac{\mathbf{j}}{|\mathbf{R} - \mathbf{r}|} + \frac{\operatorname{div}_{\mathbf{r}}\mathbf{j}}{|\mathbf{R} - \mathbf{r}|}$$

The divergence of the whole expression (the first term) can be 
transformed according to the Gauss theorem into a surface integral. 
By choosing the surface outside the system of currents, we find 
that the integral vanishes. 

In the expression div_r j, the differentiation is only with respect 
to the spatial argument. Adding (1/c)(∂φ/∂t) and div A, we get 

$$\frac{1}{c}\frac{\partial\varphi}{\partial t} + \operatorname{div}\mathbf{A} = \frac{1}{c}\int\frac{1}{|\mathbf{R} - \mathbf{r}|}\left(\frac{\partial\rho}{\partial t} + \operatorname{div}_{\mathbf{r}}\mathbf{j}\right)dV$$

But in accordance with the charge conservation law the integrand 
is zero, so that the Lorentz condition is satisfied. 


Retarded Potential at Great Distances From a System of Charges. 
We shall now look for the form of the solutions (20.13) and (20.14) 
at a great distance away from a radiating system. We note that 
the integrand depends on the argument R in both integrals in two 
ways: in the denominator and via the argument t − |R − r|/c. 
The function in the denominator depends very smoothly on R. Its 
expansion in terms of powers of R yields terms which decrease 
like R⁻ⁿ at infinity. As will be shown later, they add nothing to the 
radiation for n > 1. So we simply replace |R − r|⁻¹ by R⁻¹ and 
take it outside the integral sign. At large distances from the system 
the term |R − r|, appearing in the time argument of the numerator, 
looks like this: 

$$|\mathbf{R} - \mathbf{r}| = R - (\mathbf{r}\cdot\operatorname{grad}R) = R - \frac{\mathbf{r}\cdot\mathbf{R}}{R} = R - \mathbf{r}\cdot\mathbf{n} \tag{20.15}$$

where n is a unit vector in the direction of R. The subsequent terms 
of the expansion (20.15) contain R in the denominator and are 
insignificant. Thus, at a large distance from the radiating system 
the potentials are: 

$$\varphi = \frac{1}{R}\int\rho\left(t - \frac{R}{c} + \frac{\mathbf{r}\cdot\mathbf{n}}{c},\ \mathbf{r}\right)dV \tag{20.16}$$

$$\mathbf{A} = \frac{1}{cR}\int\mathbf{j}\left(t - \frac{R}{c} + \frac{\mathbf{r}\cdot\mathbf{n}}{c},\ \mathbf{r}\right)dV \tag{20.17}$$
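
How quickly the expansion (20.15) becomes accurate can be seen from a small numerical comparison. This is only a sketch: the source position `r_vec`, the observation direction, and the distances are arbitrary illustrative values, not data from the text. The difference between the exact distance and R − r·n should fall off roughly as 1/R.

```python
import math


def norm(v):
    """Euclidean length of a 3-vector given as a sequence of floats."""
    return math.sqrt(sum(x * x for x in v))


def exact_distance(R_vec, r_vec):
    """|R - r|, the exact distance from the source point r to the observation point R."""
    return norm([a - b for a, b in zip(R_vec, r_vec)])


def far_field_distance(R_vec, r_vec):
    """First two terms of the expansion (20.15): R - r.n, with n = R/|R|."""
    R = norm(R_vec)
    n = [x / R for x in R_vec]
    return R - sum(a * b for a, b in zip(r_vec, n))


if __name__ == "__main__":
    r_vec = (0.3, -0.2, 0.5)            # illustrative point inside the system
    direction = (1 / 3, 2 / 3, 2 / 3)   # illustrative unit vector n
    for R in (10.0, 100.0, 1000.0):
        R_vec = [R * x for x in direction]
        error = exact_distance(R_vec, r_vec) - far_field_distance(R_vec, r_vec)
        print(f"R = {R:7.1f}   neglected remainder = {error:.3e}")
```

The neglected remainder is of order r²/R, which is why it does not contribute to the terms proportional to 1/R that determine the radiation.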

The term rn/c in the arguments of the integrands in (20.16) and 
(20.17) indicates by how much an electromagnetic wave coming from 
the more distant parts of the radiating system is retarded in 
comparison with a wave radiated by the nearer parts of the system. In 
other words, the term rn/c determines the time that the 
electromagnetic wave takes to pass through the system of charges. If the 
velocity of the charges is equal to v, then in that time they are displaced 
through a distance of v(r·n)/c. The retardation inside the system 
is negligible when this distance is small in comparison with the size 
of the system r. Therefore, if v(r·n)/c ≪ r (or, more simply, v ≪ c), 
then the charges do not have time to change their positions noticeably 
during the time of propagation of the wave in the system. 

However, in order that nothing should really change in the system, 
the charges must also maintain their velocities in that time, because 
the vector potential depends on the currents, that is, on the particle 
velocities. If the charges at opposite ends of the system oscillate 
in phase, and in the time rn/c the oscillation phase reverses, the 
action of such charges is mutually weakened owing to retardation 
within the system. 

We shall now formulate the condition in which a retardation of 
an electromagnetic wave in a system does not result in an additional 
phase shift between individual emitters. Let the charges oscillate 
and radiate light of frequency ω. The wavelength of the light is 
equal to λ = 2πc/ω. In the time rn/c the phase of the charge 
oscillations changes by ω(r·n)/c. This change must be small in comparison 
with 2π at all points of the system, whence it follows that the size 
of the system must be small compared with the wavelength of the 
radiated light in order that the retardation inside the system should 
not produce a phase shift in waves coming from different emitters. 

Thus, the term rn/c in the argument of the integrand is immaterial, 
provided two inequalities, v ≪ c and r ≪ λ, are fulfilled. 
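
To get a feeling for the second inequality, the largest phase shift ωr/c = 2πr/λ can be estimated for an atom emitting visible light. The numbers below are assumed order-of-magnitude values chosen for illustration (an atomic size of 10⁻⁸ cm and a wavelength of 5 × 10⁻⁵ cm); they are not taken from the text.

```python
import math

ATOM_SIZE_CM = 1e-8            # assumed typical atomic size
VISIBLE_WAVELENGTH_CM = 5e-5   # assumed wavelength of visible light (500 nm)


def retardation_phase(size_cm, wavelength_cm):
    """Largest phase shift omega*r/c = 2*pi*r/lambda across a system of size r."""
    return 2.0 * math.pi * size_cm / wavelength_cm


if __name__ == "__main__":
    phase = retardation_phase(ATOM_SIZE_CM, VISIBLE_WAVELENGTH_CM)
    print(f"phase shift across the atom: {phase:.2e} rad")
    print(f"fraction of a full period:   {phase / (2.0 * math.pi):.1e}")
```

With these assumed values the phase shift is a small fraction of 2π, so for such a system the condition r ≪ λ is comfortably met.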

Vector Potential and Field in the Dipole Approximation. Suppose 
that both inequalities are satisfied, so that the retardation within 
the system, that is rn/c, is everywhere small. We find that it can be 
completely neglected only in the expression for the vector potential A; 
in the expression for the scalar potential, using solution (20.16), a 
higher approximation is necessary. Indeed, after putting rn/c = 0 in 
(20.16) we find that the integral refers completely to one definite 
time t − R/c, that is, it no longer depends upon r in the time 
argument. But then we would simply get 

$$\varphi = \frac{1}{R}\int\rho\left(t - \frac{R}{c},\ \mathbf{r}\right)dV = \frac{e}{R}$$

where the charge, according to the conservation law, is, in general, 
time independent. The potential acquires the electrostatic value e/R 
and makes no contribution to the electromagnetic wave emission 
effect we are considering. 

Instead of making use of subsequent terms in the expansion of 
the scalar potential in powers of the retardation time within the 
system, we can eliminate the scalar potential altogether by changing 
the gauge of the vector potential A. Indeed, if the scalar potential 
does not depend on time, then instead of the Lorentz condition (20.3) 
we should take 

div A = 0 (20.18) 

This does not affect the values of the electromagnetic field 
components. 

Since the gauge of solutions (20.16) and (20.17) corresponded to 
the Lorentz condition, to meet requirement (20.18) a supplementary 
term must be added to the vector potential. 

It follows directly from Eq. (20.17) that, without taking into 
account the electromagnetic wave retardation within the system, 
the vector potential is 

$$\mathbf{A} = \frac{1}{cR}\int\mathbf{j}\left(t - \frac{R}{c},\ \mathbf{r}\right)dV \tag{20.19}$$

Since here the time argument does not involve the coordinates of the 
charges, we replace j by ρv, remembering that now ρ refers to one 
time, t − R/c. If we are dealing with point charges, the integral 
of ρ over the domain including the charge is equal to the value of 
the charge itself. Hence we arrive at the expression 

$$\mathbf{A} = \frac{1}{cR}\left(\sum_i e_i\mathbf{v}_i\right)_{t - R/c}$$

Here t − R/c denotes that the whole sum must be taken at that 
instant of time. But since v_i = dr_i/dt, we obtain 

$$\mathbf{A} = \frac{1}{cR}\frac{d}{dt}\left(\sum_i e_i\mathbf{r}_i\right)_{t - R/c} = \frac{\dot{\mathbf{d}}(t - R/c)}{cR} \tag{20.21a}$$

Here we have used the definition for dipole moment, (16.20). 

We note that (20.21a) involves only the time derivative ḋ. Therefore, 
the transformation (20.21a), which corresponds to a constant shift 
of coordinate origin, does not change ḋ either for a charged system 
or for a neutral system. In particular, (20.21a) holds also for a single 
charge. 

The approximation (20.21a), in which A is expressed in terms of 
a derivative of the dipole moment of the system as a whole, is termed 
the dipole approximation. 

In Section 18 a potential gauge transformation for travelling 
plane waves was chosen such that the scalar potential became zero. 
We shall make the same gauge transformation for diverging spherical 
waves, that is, act in accordance with (20.18). 

In gauging the potential it should not be differentiated with 
respect to R, which is involved in the denominator: each such 
differentiation increases the power of R in the denominator by unity 
(because the potential is determined at a great distance from the 
radiating system). Each time it is sufficient to take the derivative 
with respect to the R that appears in the time argument and defines 
the retardation. Only terms inversely proportional to R make a 
contribution to the radiated energy (see further on). The unit vector 
n = R/|R|, which appears in the differentiation, does not have to be 
differentiated a second time in the required approximation, because in 
the process superfluous powers of R will appear in the denominator. 

We substitute expression (20.21a) into condition (20.18). Applying 
formula (11.37) for the divergence of a vector depending only on 
the absolute value of the radius vector, we get 


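$$\operatorname{div}\mathbf{A} = \operatorname{div}\frac{\dot{\mathbf{d}}(t - R/c)}{Rc} = -\frac{\mathbf{n}\cdot\ddot{\mathbf{d}}}{Rc^2}$$

This suggests that the gauge function f should be taken as 

$$f = \frac{\mathbf{n}\cdot\mathbf{d}}{R} \tag{20.22}$$

Then, adding to the vector potential the function grad f, we reduce 
the expression for the vector potential to the form 

$$\mathbf{A} = \frac{\dot{\mathbf{d}}}{Rc} - \frac{\mathbf{n}\,(\mathbf{n}\cdot\dot{\mathbf{d}})}{Rc} = -\frac{1}{Rc}\,\mathbf{n}\times(\mathbf{n}\times\dot{\mathbf{d}}) \tag{20.21b}$$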


Here the vector potential is denoted by the same symbol A as prior 
to the gauge transformation. 

Another operation is also possible: instead of adding to the vector 
potential grad f, we can find the scalar potential in the next 
approximation. Then, instead of div A = 0, the equivalent Lorentz 
condition 

$$\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\varphi}{\partial t} = 0$$

is satisfied. 


Let us now calculate the electromagnetic field. As pointed out, 
we need to differentiate only with respect to the argument t — R/c. 
In calculating the magnetic field, we make use of (11.38) and of the 
fact that curl grad f = 0: 

$$\mathbf{H} = \operatorname{curl}\mathbf{A} = \frac{1}{Rc}\operatorname{curl}\dot{\mathbf{d}}(t - R/c) = -\frac{1}{Rc^2}\,\mathbf{n}\times\ddot{\mathbf{d}} \tag{20.23}$$

The electric field is 

$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} = \frac{1}{Rc^2}\,\mathbf{n}\times(\mathbf{n}\times\ddot{\mathbf{d}}) = \mathbf{H}\times\mathbf{n} \tag{20.24}$$

From these equations it can be seen that the electric field, the 
magnetic field, and the vector n are mutually perpendicular. In 
addition, |H| = |E|, since |E|² = |H × n|² = |H|² − (H·n)², 
and H·n = 0. Consequently, the wave at a point R at 
a great distance from a radiating system is of the nature of a plane 
electromagnetic wave. This result was to be expected, because the 
field is calculated far away from charges, where the wave front may 
be approximately regarded as plane and the solution becomes the 
same as obtained in Section 18. 
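
The geometric statements about the wave-zone fields can be verified with a few lines of vector algebra. The sketch below assumes Eqs. (20.23) and (20.24) in the form H = −(n × d̈)/(Rc²) and E = n × (n × d̈)/(Rc²); the helper names (`cross`, `dot`, `scale`, `wave_zone_fields`) and the sample vectors are illustrative choices, not part of the text.

```python
import math


def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def scale(k, a):
    """Multiply a 3-vector by a number."""
    return tuple(k * x for x in a)


def wave_zone_fields(d_ddot, n, R=1.0, c=1.0):
    """Return (E, H) in the wave zone for a unit vector n, following (20.23) and (20.24)."""
    H = scale(-1.0 / (R * c * c), cross(n, d_ddot))
    E = scale(1.0 / (R * c * c), cross(n, cross(n, d_ddot)))
    return E, H


if __name__ == "__main__":
    d_ddot = (0.0, 0.0, 1.0)   # second derivative of the dipole moment along the polar axis
    n = (1.0, 0.0, 0.0)        # observation direction on the "equator"
    E, H = wave_zone_fields(d_ddot, n)
    print("E =", E)
    print("H =", H)
    print("E.H =", dot(E, H), " |E| =", math.sqrt(dot(E, E)), " |H| =", math.sqrt(dot(H, H)))
```

For any unit vector n and any d̈ the two fields come out perpendicular to each other and to n, with equal magnitudes, as expected for a locally plane wave.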

Figure 28 gives a general picture of the field. We situate the vector 
d̈ at the centre of a sphere of large radius R so that d̈ coincides with 
the polar axis or, in other words, is directed towards the “north 
pole”. Through the points where the field vectors are plotted, we 
draw the “longitude” and “latitude”. Then the electric field is 
tangential to the “longitude” and directed towards the “south”, while 
the magnetic field is tangential to the “latitude” and directed towards 
the “east”. It can be seen from Eqs. (20.23) and (20.24) that the 
field vanishes at the “poles” and is maximum at the “equator”, that 
is, in a plane perpendicular to d̈. Thus, the field distribution in space 
does not possess spherical symmetry. Since the field is transverse 
and perpendicular to the radius at every point, it cannot be 
spherically symmetric for purely geometrical reasons. The zone at a large 
distance from the emitter in which the field is calculated according 
to Eqs. (20.23) and (20.24) is called the wave zone. 


The Intensity of Dipole Radiation. Let us now find the energy 
dissipated by the system in radiation. For this we must calculate 
the energy flux crossing a sphere infinitely distant from the emitter. 

As was shown in Section 15 (see (15.26)), the density of the energy 
flux, or the Poynting vector, is (c/4π)(E × H). From this we find 
that the radiation energy crossing a unit surface at distance R from 
the radiator in the direction of n in unit time is 

$$\frac{c}{4\pi}\,\mathbf{E}\times\mathbf{H} = \frac{c}{4\pi}\,(\mathbf{H}\times\mathbf{n})\times\mathbf{H} = \frac{c}{4\pi}\,\mathbf{n}\,|\mathbf{H}|^2 \tag{20.25}$$

The energy flux is directed along the radius from the emitter, as 
it should be in a wave zone in which the wave is almost plane. The 
total energy flux crossing the whole sphere of radius R in unit 
time is 

$$\frac{dE}{dt} = \frac{c}{4\pi}\int(\mathbf{E}\times\mathbf{H})\cdot d\mathbf{S} = \frac{c}{4\pi}\int|\mathbf{H}|^2\,(\mathbf{n}\cdot d\mathbf{S}) = \frac{c}{4\pi}\int|\mathbf{H}|^2\,dS \tag{20.26}$$

because n is directed along dS. Furthermore, the surface element 
dS = R² 2π sin ϑ dϑ, where ϑ is the polar angle. From (20.23) 

$$|\mathbf{H}|^2 = \frac{|\ddot{\mathbf{d}}|^2\sin^2\vartheta}{R^2c^4} \tag{20.27}$$

Substituting (20.27) into (20.26), cancelling out R², and 
integrating, we obtain the expression for the energy radiated in unit time: 

$$\frac{dE}{dt} = \frac{2}{3}\frac{|\ddot{\mathbf{d}}|^2}{c^3} \tag{20.28}$$
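
The factor 2/3 in (20.28) comes from the angular integral of sin²ϑ over the sphere. As an informal check, the sketch below integrates (20.27) over the sphere with a midpoint rule; the function name `dipole_power` and the number of steps are illustrative choices.

```python
import math


def dipole_power(d_ddot_abs, c, steps=20000):
    """Energy radiated per unit time, obtained by integrating (20.27) over the sphere.

    The integrand is (c / 4 pi) |H|^2 dS with |H|^2 = |d''|^2 sin^2(theta) / (R^2 c^4)
    and dS = R^2 * 2 pi sin(theta) d(theta), so the radius R cancels out.
    """
    d_theta = math.pi / steps
    # midpoint rule for the integral of sin^3(theta) from 0 to pi
    angular = sum(math.sin((k + 0.5) * d_theta) ** 3 for k in range(steps)) * d_theta
    return d_ddot_abs ** 2 / (4.0 * math.pi * c ** 3) * 2.0 * math.pi * angular


if __name__ == "__main__":
    numeric = dipole_power(1.0, 1.0)
    print("numerical coefficient:", numeric)
    print("closed form 2/3:      ", 2.0 / 3.0)
```

The numerical coefficient agrees with 2/3 because the integral of sin³ϑ from 0 to π equals 4/3.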


Note that, for sufficiently large R, none of the terms in the field 
expressions involving R in the denominator in higher than first 
power would contribute anything to the energy radiation. That is 
why only first degree terms in R were retained in the denominator. 
But if the problem is posed of computing the angular momentum 
flux rather than the energy flux, greater accuracy is required 
in looking for the field at large distances. The angular momentum 
flux in the wave zone is zero, which can be ascertained from the 
formulas of Section 15. 

Equation (20.28) expresses a result of fundamental importance: 
whenever a charge is accelerated it radiates energy. Indeed, 
d̈ = Σ e_i r̈_i. Hence, for d̈ to differ from zero, accelerated motion of the 
charges is necessary, irrespective of the sign of the accelerations. 

Application of this result to the hydrogen atom reveals a striking 
contradiction with experiment. An electron in finite motion along 
any orbit must necessarily dissipate all its energy to the 
electromagnetic field and ultimately fall on the nucleus. Actually nothing 
of the sort occurs: the atom is stable. 

This example reveals how completely inapplicable Newtonian 
mechanics is to the motion of the electron in an atom. In Part III 
we shall explain the stability of atoms with the help of quantum 
mechanics, in which the very concept of motion differs from the 
classical (Newtonian) view. 


Radiation Damping. We have thus found that a charge moving 
with acceleration continuously radiates energy, transmitting it to 
the electromagnetic field. We shall now show that, in the case of 
finite motion of a charge, the dissipation of its energy as radiation 
can be reduced to the action of a certain effective force of “friction”, F. 
We define this force by the equation 

$$\frac{dE}{dt} = -\mathbf{F}\cdot\dot{\mathbf{r}} \tag{20.29}$$

since the product of force times velocity is the work done in unit 
time. Equation (20.28) can be transformed as follows: 

$$\frac{dE}{dt} = \frac{2}{3}\frac{e^2}{c^3}\,\ddot{\mathbf{r}}^2 = \frac{2}{3}\frac{e^2}{c^3}\frac{d}{dt}\left(\dot{\mathbf{r}}\cdot\ddot{\mathbf{r}}\right) - \frac{2}{3}\frac{e^2}{c^3}\,\dot{\mathbf{r}}\cdot\dddot{\mathbf{r}}$$

We average the obtained relationship over time, taking into 
account that in finite motion the mean value of a total derivative 
is zero (see Sec. 17). Then from a comparison with (20.29) we find 
that the radiative reaction force is 

$$\mathbf{F} = \frac{2}{3}\frac{e^2}{c^3}\,\dddot{\mathbf{r}} \tag{20.30}$$

It is proportional to the third time derivative of the radius vector 
of the charge, or to the second time derivative of its velocity. 

By means of somewhat more involved computations it can be 
shown that the applicability of Eq. (20.30) is not restricted to a 
finite motion of a charge. The obtained expression is not only the 
mean but also the instantaneous value of the force of radiation 
damping. 

In developing formula (20.28) we neglected the retardation of 
electromagnetic waves within the system, and the expression is 
therefore to a certain degree approximate. But in dealing with 
a point charge, taking into account the extra power of rn in the 
integrand in Eq. (20.17) yields a zero contribution at the limit, 
since for a point charge r tends to zero. For that reason Eqs. (20.28) 
and (20.30) are strictly applicable to point charges. 

Let us find the form of Eqs. (20.28) and (20.30) for the case of 
fast moving charges. For this we must obtain relativistically 
invariant formulas, which in the limit of small velocities transform 
into (20.28) and (20.30). We rewrite the first equation as follows: 

$$dE = \frac{2}{3}\frac{e^2}{c^3}\left(\frac{d^2\mathbf{r}}{dt^2}\right)^2 dt$$

Here t is the time in the charge’s proper reference frame. In passing 
to an arbitrary reference frame we must express it, as we know from 
Section 13, in terms of an invariant quantity, the interval ds: 

$$dt = \frac{ds}{c}$$

Further, in four-dimensional notation, −dE = ic dp₄, −dt = 
(i/c) dx₄. Besides, (d²r/dt²)² = c⁴ (du_i/ds)(du_i/ds), where u_i = 
dx_i/ds, that is, four-dimensional velocity. Hence, in passing 
to an arbitrary reference frame, in place of the fourth components p₄ 
and x₄ we must write equations for all the components. This yields 

$$dp_i = \frac{2e^2}{3c}\,\frac{du_k}{ds}\frac{du_k}{ds}\,dx_i \tag{20.31}$$


Now we can find the four-dimensional expression for the force F. 
We start with defining the four-vector of force: 

$$mc\,\frac{du_i}{ds} = F_i \tag{20.32}$$

In the case of a Lorentz force, F_i = (e/c) F_ik u_k. Multiplying both 
sides of Eq. (20.32) by u_i, we obtain in the left-hand side 

$$mc\,u_i\frac{du_i}{ds} = \frac{mc}{2}\frac{d}{ds}\left(u_i^2\right)$$

But u_i² = −1, so that u_i(du_i/ds) = 0. From this we obtain the 
condition which any four-dimensional force F_i must satisfy 
identically: 

$$F_i u_i = 0 \tag{20.33}$$

In addition, we must require that at the limit of low velocities F_i 
transforms into the three-dimensional nonrelativistic expression 
(20.30). 

As was just shown, the second derivative with respect to time is 
expressed in terms of the second derivative with respect to the 
interval. Using the four-dimensional vectors d²u_i/ds² and u_k, we 
can develop the one expression which satisfies both requirements: 

$$F_i = \frac{2e^2}{3c}\left(\frac{d^2u_i}{ds^2} + u_iu_k\frac{d^2u_k}{ds^2}\right) \tag{20.34}$$

Indeed, multiplying it by u_i, we obtain 

$$F_iu_i = \frac{2e^2}{3c}\left(u_i\frac{d^2u_i}{ds^2} + u_iu_i\,u_k\frac{d^2u_k}{ds^2}\right) = 0$$

in agreement with (20.33), since u_i u_i = −1. Furthermore, in the 
nonrelativistic approximation, u_α = 0 and u₄ = i, so that du₄/ds = 0. 
Consequently, in such an approximation Eq. (20.34) contains only the 
first term in the parentheses in the right-hand side, which transforms 
into (20.30) for three spatial components. 

Thus, Eqs. (20.28) and (20.30), together with their relativistic 
generalizations, should be applicable to arbitrarily moving charges 
without any restrictions. 

But the expression for F allows for the accelerated motion of a 
charge in the absence of any external field! Indeed, assuming F to be 
the only force acting on the charge, we can write the equation 


$$m\frac{d^2\mathbf{r}}{dt^2} = \frac{2e^2}{3c^3}\frac{d^3\mathbf{r}}{dt^3}$$

with a solution 

$$\mathbf{r} = \mathbf{r}_0\exp\left(\frac{3mc^3}{2e^2}\,t\right)$$

corresponding to self-accelerated motion of the charge. 

Thus, the electrodynamics of point charges is inherently 
contradictory. 

The difficulties are even greater if a charge is taken to be extended. 
It becomes necessary to assume the existence of some nonelectric 
forces which hold the charge together in a finite volume and keep 
it from flying apart under the action of Coulomb repulsion. But such 
forces have never been observed. Furthermore, if we assume the 
existence of a nonelectric force capable of acting on charges, then 
we must concede that electrodynamics is basically an open-ended 
theory incapable of resolving questions lying within the domain of 
its application. 

Actually the situation is not all that gloomy. The obtained absurd 
solution corresponds to self-acceleration of a charge in a time interval 
of the order of 10⁻²³ s. Phenomena which take place in such small 
time intervals belong to the domain of the quantum theory, where 
classical notions break down. This will be shown in Part III. For 
reasons due to quantum theory, classical electrodynamics is 
applicable to phenomena which take place in not less than 10⁻²¹ s. This 
means the following. If a process occurring in, say, 10⁻¹⁹ s is 
examined, the error due to the use of the nonquantum theory is at worst 
of the order of 10⁻², while the error associated with the contradiction 
within the theory itself is of the order of 10⁻⁴. In all cases this 
“intrinsic inaccuracy” is smaller by a factor of one hundred. That 
is why there is no sense in attempting to change the theory in a way 
that would eliminate small errors while leaving big ones. 
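
The time scale of the self-accelerating solution is the combination 2e²/(3mc³) that appears in the exponent. A rough evaluation for the electron is sketched below; the rounded Gaussian-unit constants are standard reference values supplied here for illustration, not figures quoted in the text.

```python
# Rounded Gaussian (CGS) constants for the electron, supplied here as assumptions.
E_CHARGE = 4.803e-10    # electron charge, esu
M_ELECTRON = 9.109e-28  # electron mass, g
C_LIGHT = 2.998e10      # speed of light, cm/s


def self_acceleration_time(e, m, c):
    """Characteristic time 2 e^2 / (3 m c^3) of the runaway solution of m r'' = (2e^2/3c^3) r'''."""
    return 2.0 * e * e / (3.0 * m * c ** 3)


if __name__ == "__main__":
    tau = self_acceleration_time(E_CHARGE, M_ELECTRON, C_LIGHT)
    print(f"tau = {tau:.2e} s")
    # relative size of the intrinsic inconsistency for a process lasting 1e-19 s
    print(f"tau / 1e-19 s = {tau / 1e-19:.1e}")
```

With these constants the time comes out at about 6 × 10⁻²⁴ s, consistent with the order of 10⁻²³ s quoted above, and its ratio to a 10⁻¹⁹ s process is of the order of 10⁻⁴.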

The quantum theory also leads to certain difficulties, but they 
can be overcome without introducing interactions of a 
nonelectromagnetic nature. 

Magnetic Dipole and Quadrupole Radiation. We have indicated 
that charges must be in accelerated motion to radiate. But this 
condition, though necessary, is not sufficient. A simple example 
can be given when (20.28) yields zero even for accelerated charges. 
Let a system consist of two identical, charged and colliding 
particles. According to Newton’s Third Law, their accelerations are 
equal and opposite in sign, so that d̈ = Σ e r̈_i = e(r̈₁ + r̈₂) = 0. 
In this case the law is applicable because, in the dipole 
approximation, the retardation of electromagnetic interactions inside the 
system is considered to be negligible and, hence, the interaction 
forces between the charges are regarded as instantaneous. But there 
is then no need to take account of the momentum transmitted to 
the field, and the total momentum of the particles is conserved, 
thereby leading to the condition r̈₁ = −r̈₂. For this case Eq. (20.19) 
yields zero, so that it turns out that the waves emitted by each 
charge separately mutually cancel out, and it becomes necessary 
to use higher-order approximations. 

If in the expansion of the vector potential (20.17) in powers of rn 
a further term is retained in addition to the zero-power term, we 
obtain an expression which describes radiation that depends on the 
change in the quadrupole and magnetic moments of the system. 
But this expansion is not in inverse powers of R, as in 
electrostatics and magnetostatics, but in powers of the retardation within 
the system. All corrections to the radiation field we shall now be 
looking for are inversely proportional to the first power of R. 

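Write the corresponding term of the vector potential expansion: 

$$\mathbf{A}' = \frac{1}{c^2R}\frac{\partial}{\partial t}\int(\mathbf{r}\cdot\mathbf{n})\,\mathbf{j}\,dV \tag{20.35}$$

The derivative with respect to time is taken outside the integral 
sign, time being an independent variable. Transform the integral 
as follows: 

$$\int(\mathbf{r}\cdot\mathbf{n})\,\mathbf{j}\,dV = \frac{1}{2}\int\left[(\mathbf{r}\cdot\mathbf{n})\,\mathbf{j} - (\mathbf{n}\cdot\mathbf{j})\,\mathbf{r}\right]dV + \frac{1}{2}\int\left[(\mathbf{r}\cdot\mathbf{n})\,\mathbf{j} + (\mathbf{n}\cdot\mathbf{j})\,\mathbf{r}\right]dV \tag{20.36}$$

Here the first integral is 

$$-\frac{1}{2}\int\left[\mathbf{n}\times(\mathbf{r}\times\mathbf{j})\right]dV = -\frac{1}{2}\int\rho\left[\mathbf{n}\times(\mathbf{r}\times\mathbf{v})\right]dV = c\,\boldsymbol{\mu}\times\mathbf{n}$$

that is, it is expressed in terms of the magnetic moment of the 
system (see (17.19)). 

Subtract from the second integral the expression (2/3) n ∫ (r·j) dV. 
It corresponds to a vector along n, that is, along the propagation 
direction of the wave at the given point. According to the general 
theory, such a component of the vector potential does not affect 
the field. After this, the second term is equal to 

$$\frac{1}{2}\int\rho\left[(\mathbf{r}\cdot\mathbf{n})\,\mathbf{v} + (\mathbf{v}\cdot\mathbf{n})\,\mathbf{r} - \frac{2}{3}\,\mathbf{n}\,(\mathbf{r}\cdot\mathbf{v})\right]dV = \frac{1}{2}\sum_i e_i\left[\mathbf{v}_i(\mathbf{n}\cdot\mathbf{r}_i) + \mathbf{r}_i(\mathbf{n}\cdot\mathbf{v}_i) - \frac{2}{3}\,\mathbf{n}\,(\mathbf{r}_i\cdot\mathbf{v}_i)\right] \tag{20.37}$$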

The obtained result is the time derivative of the quadrupole 
moment multiplied by vector n (see (16.18)). If we define the vector 

$$Q_\alpha = q_{\alpha\beta}\,n_\beta \tag{20.38}$$

then the expression for (20.37) takes the form 

$$\frac{1}{2}\int\rho\left[(\mathbf{r}\cdot\mathbf{n})\,\mathbf{v} + (\mathbf{v}\cdot\mathbf{n})\,\mathbf{r} - \frac{2}{3}\,\mathbf{n}\,(\mathbf{r}\cdot\mathbf{v})\right]_\alpha dV = \frac{1}{6}\,\dot{Q}_\alpha \tag{20.39}$$

Collecting now the obtained expressions, we can find the required 
term of the vector potential that describes the retardation of an 
electromagnetic wave within a system of radiating charges: 

$$\mathbf{A}' = \frac{1}{cR}\,\dot{\boldsymbol{\mu}}\times\mathbf{n} + \frac{1}{6c^2R}\,\ddot{\mathbf{Q}} \tag{20.40}$$

It contains the first derivative of the magnetic moment of the 
system and the second derivative of the electric quadrupole moment. 
Strictly speaking, (20.40) should additionally be gauged in 
accordance with condition (20.18). But this need not be done, bearing 
in mind that the expression for the magnetic field is not affected 
by this, while from (20.24) the electric field equals H × n, as in 
any plane or nearly plane wave. 

It was already pointed out that when v/c ≪ 1 and r/λ ≪ 1, the 
retardation within the system is small. The ratio v/c is involved 
in the magnetic moment of the system, therefore the terms in the 
expansion in powers of the retardation that are proportional to v/c 
yield magnetic dipole radiation. The quadrupole moment of the 
system contains an additional power of r in comparison with the 
dipole moment, therefore quadrupole radiation is associated with 
expansion terms proportional to r/λ. 

The field of a magnetic dipole emitter is similar to the field of 
a radiating electric dipole. Unlike the field represented in Figure 28, 
a magnetic field radiated by a magnetic dipole lies in the same plane 
with μ̈, that is, it is directed along the “longitude”, while the electric 
field is along the “latitude”. The formula for the energy radiated in 
this case is fully similar to (20.28), only instead of |d̈|² it contains 
|μ̈|². Since the magnetic moment is proportional to v/c, the radiated 
energy decreases (v/c)² times in comparison with electric dipole 
radiation. 

The field of a radiating quadrupole is of a more complex shape. 
The expression for the radiated energy involves the square of the 
third derivative of the quadrupole moment of the system. In respect 
of the order of magnitude, a quadrupole radiator emits (r/λ)² times 
less energy in unit time than a dipole emitter. 

The electric and magnetic fields of an arbitrary radiating system 
calculated to the accuracy of the first power of the retardation (rn)/c 
consist of three components: electrodipole, magnetodipole, and 
electroquadrupole. Accordingly, the Poynting vector contains mixed terms, 
that is, products of fields corresponding to emitters of different types. 
But the total radiated energy per unit time is composed of three 
terms due to each type of radiation separately. This is readily 
demonstrated as follows. 

The radiated energy is a scalar. The mixed terms due to different 
types of radiation ought to involve scalar combinations of the 
respective moments that are linear with respect to each of them. 
Since integration is performed over all directions of the 
vector n, the result can depend only on the product of two moments. 
But no such product is a true scalar. Indeed, the product (d̈·μ̈) is 
obtained from a vector d̈ and a pseudovector μ̈ (see Sec. 15), that is, 
it is a pseudoscalar, which cannot be equal to the radiated energy 
(a true scalar). Furthermore, d̈_α and μ̈_α have one tensor index each, 
whereas the quadrupole moment, q⃛_αβ, has two tensor indices. 
Hence a scalar linear in q⃛_αβ and d̈_α or in q⃛_αβ and μ̈_α cannot be 
developed from q⃛_αβ and d̈_α or μ̈_α. 

There remains the sum of three terms, each of which is a function 
of one of the squares d̈_α d̈_α, μ̈_α μ̈_α, or q⃛_αβ q⃛_αβ. Since the field of a 
radiating magnetic dipole is similar to the field of a radiating electric 
dipole, the total energy radiated per unit time by a magnetic dipole is 
given by a formula similar to (20.28): 

$$\frac{dE_\mu}{dt} = \frac{2}{3}\frac{|\ddot{\boldsymbol{\mu}}|^2}{c^3} \tag{20.41}$$

It is more difficult to calculate the radiation of an electric 
quadrupole. Its magnetic field is 

$$\mathbf{H}_q = -\frac{1}{6c^3R}\,\mathbf{n}\times\dddot{\mathbf{Q}} \tag{20.42}$$

Since in a wave zone the electric field is associated with the 
magnetic field by the relationship E = H × n, we can write, by analogy 
with (20.26) 

$$\frac{dE_q}{dt} = \frac{1}{144\pi c^5}\int d\Omega\,\left(\mathbf{n}\times\dddot{\mathbf{Q}}\right)^2 = \frac{1}{144\pi c^5}\int\left[\,|\dddot{\mathbf{Q}}|^2 - (\mathbf{n}\cdot\dddot{\mathbf{Q}})^2\right]d\Omega \tag{20.43}$$

From this, using (20.38), we find 

$$\frac{dE_q}{dt} = \frac{1}{144\pi c^5}\int\left\{(n_\alpha\dddot{q}_{\alpha\beta})(n_\gamma\dddot{q}_{\gamma\beta}) - (n_\alpha n_\beta\dddot{q}_{\alpha\beta})^2\right\}d\Omega$$

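The integrals over the angles are computed in the following way.
The integral of the product of two components $n_\alpha$ and $n_\gamma$ is other
than zero only when $\alpha = \gamma$. Hence the integral should be

$$\int n_\alpha n_\gamma\,d\Omega = a\,\delta_{\alpha\gamma}$$

To determine $a$ we put $\gamma = \alpha$ and sum over $\alpha$. Since $n_\alpha n_\alpha = 1$
and $\delta_{\alpha\alpha} = 3$, we obtain $a = 4\pi/3$.

The integral of the product of four components $n_\alpha$, $n_\beta$, $n_\gamma$, and $n_\delta$
can be other than zero only when the subscripts of the components
are equal in pairs in one of three ways: $\alpha = \beta$, $\gamma = \delta$; $\alpha = \gamma$,
$\beta = \delta$; $\alpha = \delta$, $\beta = \gamma$. Hence

$$\int n_\alpha n_\beta n_\gamma n_\delta\,d\Omega = b\,(\delta_{\alpha\beta}\delta_{\gamma\delta} + \delta_{\alpha\gamma}\delta_{\beta\delta} + \delta_{\alpha\delta}\delta_{\beta\gamma})$$

To find $b$ we put $\beta = \alpha$ and sum again over $\alpha$. Then we obtain
$\int n_\gamma n_\delta\,d\Omega = (4\pi/3)\,\delta_{\gamma\delta}$ in the left-hand side and $5b\,\delta_{\gamma\delta}$ in the
right-hand side, so that $b = 4\pi/15$.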
We finally find that 


$$\frac{dE_q}{dt} = \frac{1}{5c^5}\,\dddot{q}_{\alpha\beta}\,\dddot{q}_{\alpha\beta} \tag{20.44}$$
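
The two angular coefficients can be checked numerically. The sketch below is only an illustration: the function name and grid sizes are assumptions, not part of the text. It integrates products of the components of n over the unit sphere by the midpoint rule and compares the results with a = 4π/3 and b = 4π/15. For a symmetric tensor with zero trace (assumed here for the quadrupole moment), the angular integral for the quadrupole radiation then reduces to (a − 2b) = 4π/5 times the sum of the squared components.

```python
import math

def sphere_integrals(n_theta=200, n_phi=400):
    """Midpoint-rule integrals of n_x^2, n_x^4 and n_x^2 n_y^2 over the unit sphere."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    i_xx = i_xxxx = i_xxyy = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sin_t = math.sin(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            nx = sin_t * math.cos(phi)
            ny = sin_t * math.sin(phi)
            weight = sin_t * d_theta * d_phi  # element of solid angle
            i_xx += nx * nx * weight
            i_xxxx += nx ** 4 * weight
            i_xxyy += nx * nx * ny * ny * weight
    return i_xx, i_xxxx, i_xxyy

i_xx, i_xxxx, i_xxyy = sphere_integrals()
a = i_xx    # integral of n_x n_x equals a
b = i_xxyy  # integral of n_x n_x n_y n_y equals b (only one Kronecker term survives)

print("a       =", a, " expected", 4 * math.pi / 3)
print("b       =", b, " expected", 4 * math.pi / 15)
print("n_x^4   =", i_xxxx, " expected 3b =", 3 * b)
print("a - 2b  =", a - 2 * b, " expected", 4 * math.pi / 5)
```

The integral of the fourth power of one component equals 3b because all three Kronecker terms contribute when the four indices coincide.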


It was stated before that electric dipole radiation cannot occur
in collisions of two identical charges. Since angular momentum $\mathbf{M}$
is conserved in collisions, the magnetic moment proportional to it
is also conserved, and $\dot{\boldsymbol{\mu}} = 0$. Consequently only electric quadrupole
radiation remains.
EXERCISES

1. Calculate the time that it takes a charge moving in a circular orbit 
around a centre of attraction to fall on the centre as a result of the radiation 
of electromagnetic waves. 

2. A particle with charge $e$ and mass $m$ passes, with velocity $v$, a fixed
particle of charge $Ze$, at a distance $\rho$. Ignoring the distortion in the orbit
of the oncoming particle, calculate the energy that this particle loses in
radiation.

Answer.

$$\Delta E = \frac{2}{3}\frac{e^2}{c^3}\int_{-\infty}^{\infty}\ddot{\mathbf{r}}^2\,dt = \frac{2}{3}\frac{Z^2e^6}{m^2c^3}\int_{-\infty}^{\infty}\frac{dt}{(\rho^2 + v^2t^2)^2} = \frac{\pi Z^2e^6}{3m^2c^3\rho^3v}$$
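
The time integral in this answer can be checked numerically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the substitution t = (ρ/v) tan u is chosen here only to map the infinite range onto a finite one. It compares the numerical value with π/(2ρ³v) and the resulting energy loss with the closed form πZ²e⁶/(3m²c³ρ³v), taking the acceleration as Ze²/(m r²) along an undistorted straight path.

```python
import math

def time_integral(rho, v, n=2000):
    """Midpoint-rule value of the integral of dt / (rho^2 + v^2 t^2)^2 over all t,
    using the substitution t = (rho / v) * tan(u) with -pi/2 < u < pi/2."""
    du = math.pi / n
    total = 0.0
    for i in range(n):
        u = -0.5 * math.pi + (i + 0.5) * du
        t = (rho / v) * math.tan(u)
        dt_du = (rho / v) / math.cos(u) ** 2
        total += dt_du / (rho ** 2 + (v * t) ** 2) ** 2 * du
    return total

def radiated_energy(z, e, m, c, rho, v):
    """Energy loss (2/3)(e^2/c^3) * integral of (Z e^2 / (m r^2))^2 dt."""
    return (2.0 / 3.0) * z ** 2 * e ** 6 / (m ** 2 * c ** 3) * time_integral(rho, v)

rho, v = 2.0, 3.0  # arbitrary illustrative values
numeric = time_integral(rho, v)
exact = math.pi / (2.0 * rho ** 3 * v)
print(numeric, exact)

# Closed form of the energy loss with unit charge, mass and c, and Z = 2
closed_form = math.pi * 2 ** 2 / (3.0 * rho ** 3 * v)
print(radiated_energy(2, 1.0, 1.0, 1.0, rho, v), closed_form)
```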

3. A plane light wave falls on a free electron, causing it to oscillate. 
The electron begins to radiate secondary waves, that is, it scatters the
radiation. Find the scattering cross section, defined as the ratio of the energy
scattered in unit time to the flux density of the incident radiation. 

Solution. We proceed from the fact that $\ddot{\mathbf{r}} = e\mathbf{E}/m$. It is apparent from
this that if the scattered light propagates perpendicular to the incident
beam, scattered radiation is caused only by the component of $\mathbf{E}$ along the
third direction, normal to both beams. The scattered light turns out to be
plane polarized. In other directions the light is partially polarized.
Knowing $\ddot{\mathbf{r}}$, we determine $dE/dt$ from (20.28). Dividing by the energy flux
$c\,|\mathbf{E}|^2/(4\pi)$, we obtain

$$\sigma = \frac{8\pi}{3}\frac{e^4}{m^2c^4}$$
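
To get a feeling for the magnitude of this cross section, it can be evaluated for an electron. The sketch below assumes rounded CGS values of the electron charge, electron mass and speed of light; these numbers and the function name are assumptions made for the illustration and are not taken from the text.

```python
import math

E_CHARGE = 4.803e-10  # electron charge in CGS electrostatic units (assumed value)
M_E = 9.109e-28       # electron mass in grams (assumed value)
C = 2.998e10          # speed of light in cm/s (assumed value)

def scattering_cross_section(e=E_CHARGE, m=M_E, c=C):
    """Cross section (8*pi/3) * e^4 / (m^2 c^4) in cm^2."""
    return 8.0 * math.pi / 3.0 * e ** 4 / (m ** 2 * c ** 4)

sigma = scattering_cross_section()
print("sigma =", sigma, "cm^2")
```

With these inputs the result is of the order of 10⁻²⁴ cm², independent of the frequency of the incident wave.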

4. Find the motion and radiation of a charge elastically connected
with a point in space so that the frequency of its free oscillations is $\omega_0$.
The charge is located in a magnetic field

$$H_z = |\mathbf{H}|, \quad H_x = 0, \quad H_y = 0$$

Solution. The oscillations of the charge are described by the equations

$$m\ddot{x} = -m\omega_0^2x + \frac{e}{c}|\mathbf{H}|\,\dot{y}$$

$$m\ddot{y} = -m\omega_0^2y - \frac{e}{c}|\mathbf{H}|\,\dot{x}$$

$$m\ddot{z} = -m\omega_0^2z$$

The third equation is independent of both the first two equations and the
magnetic field. The first two equations are easily solved if we put
$x = ae^{i\omega t}$ and $y = be^{i\omega t}$. Then

$$a\,(\omega_0^2 - \omega^2) - i\omega\frac{e|\mathbf{H}|}{mc}\,b = 0$$

$$b\,(\omega_0^2 - \omega^2) + i\omega\frac{e|\mathbf{H}|}{mc}\,a = 0$$

Now multiply the second equation by $i$, add it once to the first
equation and subtract it once from the first equation. Then the combinations
$a \pm ib$ satisfy the equations

$$(a \pm ib)\,(\omega_0^2 - \omega^2) \mp (a \pm ib)\,\omega\frac{e|\mathbf{H}|}{mc} = 0$$

Cancelling out $(a \pm ib)$, we arrive at the frequency equations:

$$\omega_0^2 - \omega^2 \mp \frac{e|\mathbf{H}|\,\omega}{mc} = 0$$


Assuming $e|\mathbf{H}|/(mc)$ to be small in comparison with $\omega_0$, we replace
$\omega$ by $\omega_0$ in the term $e|\mathbf{H}|\omega/(mc)$ and represent the difference
$\omega_0^2 - \omega^2$ as $(\omega_0 + \omega)(\omega_0 - \omega)$, which is approximately equal to
$2\omega_0(\omega_0 - \omega)$. Then for the frequencies of both oscillations we obtain

$$\omega = \omega_0 \pm \frac{e|\mathbf{H}|}{2mc} = \omega_0 \pm \omega_L$$

They differ from the unshifted frequency $\omega_0$ precisely by $e|\mathbf{H}|/(2mc)$, that
is, by Larmor’s frequency.
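
The quality of this approximation is easy to check. The sketch below, with hypothetical names and arbitrary illustrative numbers, solves the two quadratic frequency equations ω₀² − ω² ± κω = 0 exactly for the positive roots, where κ stands for e|H|/(mc), and compares them with ω₀ ± κ/2.

```python
import math

def normal_frequencies(omega0, kappa):
    """Positive roots of omega0^2 - omega^2 +/- kappa*omega = 0."""
    root = math.sqrt(omega0 ** 2 + 0.25 * kappa ** 2)
    return root + 0.5 * kappa, root - 0.5 * kappa

omega0, kappa = 1.0, 1.0e-3  # illustrative values with kappa much smaller than omega0
w_plus, w_minus = normal_frequencies(omega0, kappa)
larmor = 0.5 * kappa         # the Larmor frequency in these units

print("exact:      ", w_plus, w_minus)
print("approximate:", omega0 + larmor, omega0 - larmor)
```

The exact and approximate values differ only in second order in κ/ω₀, which is the accuracy of the approximation made above.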

After substitution of $\omega = \omega_0 \pm e|\mathbf{H}|/(2mc)$ in the equations for $a$
and $b$, we have, in the same approximation,

$$a = \mp ib$$

Representing the coordinates in real form, we obtain for both oscillations:

$$x = a\cos(\omega_0 \pm \omega_L)t, \qquad y = \mp a\sin(\omega_0 \pm \omega_L)t$$

Hence, the radius vector of the normal oscillation of frequency
$\omega_0 + e|\mathbf{H}|/(2mc)$ rotates clockwise, while the radius vector of the oscillation
with frequency $\omega_0 - e|\mathbf{H}|/(2mc)$ rotates counterclockwise. Thus, in agreement
with Larmor’s theorem, the frequency $e|\mathbf{H}|/(2mc)$ is either added to the
frequency $\omega_0$ or subtracted from it, depending on the sense of rotation of
the charge.

Let us investigate the radiation of such a charge in a magnetic field.
The electric vector of a radiated electromagnetic wave lies in the same
plane as the displacement vector of the charge (see Figure 28). If the
radiation is due to the $z$ component of the dipole moment, its polarization
vector is directed along the $z$ axis, and the field is proportional to $\ddot{z}$.
Hence, such radiation is plane-polarized and has a frequency $\omega_0$. A charge
oscillating along a magnetic field with the unshifted frequency $\omega_0$ radiates
electromagnetic waves polarized in the plane containing the magnetic field.
A charge oscillating in this manner does not radiate in the direction of
the magnetic field at all.





Oscillations with the frequencies $\omega_0 \pm e|\mathbf{H}|/(2mc)$ make for the
radiation of circularly polarized waves, the propagation direction of which
must coincide with the direction of the magnetic field. The electric field
vector in such waves rotates in the same sense as the radius vector of the
charge.

If an emitter is placed in the field of an electromagnet, its radiation
can be observed perpendicular to the field, while if a hole is drilled in the
core of the magnet, the radiation can be observed parallel to the field. In
the former case, the observer records oscillations along the $z$ axis, and one
projection of each oscillation with right and left polarization. Therefore,
both circularly polarized oscillations radiate plane-polarized waves of
frequency $\omega_0 \pm e|\mathbf{H}|/(2mc)$ in that direction. A three-fold splitting of
the unshifted frequency in the spectrum occurs.

Two frequencies in the spectrum are manifest when observation is along 
the field; here both oscillations are circularly polarized. When the observation 
is carried out at an angle to the field, the oscillations are elliptically
polarized, and, in addition, there remains the unshifted frequency.

The computations set forth here give the classical theory of the Zeeman 
effect. Actually, in the case of not very strong magnetic fields an entirely 
different picture of splitting of spectral frequencies is observed; it is correctly 
described only by the quantum theory. 

By observing the Zeeman effect of celestial bodies it is possible to judge 
the magnitude and direction of the magnetic field near them. 

5. Find the radiation energy dissipated in unit time by a charge rotating
in an external constant and homogeneous magnetic field with a velocity close to
that of light. 

Solution. We apply formula (20.31). The quantity $d^2x_k/ds^2$ should be
taken from (14.33):

$$\frac{d^2x_k}{ds^2} = \frac{e}{mc}F_{ki}u_i$$

Taking into account that only one component of the magnetic field is other
than zero, for example, $H_z = F_{12} = -F_{21}$, we obtain

$$\left(\frac{d^2x_k}{ds^2}\right)^2 = \frac{e^2|\mathbf{H}|^2}{m^2c^2}\,(u_1^2 + u_2^2) = \frac{e^2|\mathbf{H}|^2v^2}{m^2c^2}\left(\frac{dt}{ds}\right)^2 = \frac{e^2|\mathbf{H}|^2v^2E^2}{m^4c^6}$$

Hence the energy loss in unit time is

$$\frac{dE}{dt} = \frac{2}{3}\frac{e^4|\mathbf{H}|^2v^2E^2}{m^4c^9}$$


6. Develop the field of a radiating quadrupole in the wave zone. 
Solution. Let the $z$ axis be directed along the third principal axis of
tensor $\dddot{q}_{\alpha\beta}$, which we denote simply $D_{\alpha\beta}$. Then the components of vector
$Q_\alpha$, defined by Eq. (20.38), are given by the equations

$$Q_x = n_xD_1, \qquad Q_y = n_yD_2, \qquad Q_z = -n_z(D_1 + D_2)$$
The magnetic field projections in Cartesian components are:

$$H_x = (\mathbf{n}\times\mathbf{Q})_x = -n_yn_z\,(D_1 + 2D_2)$$

$$H_y = (\mathbf{n}\times\mathbf{Q})_y = n_xn_z\,(2D_1 + D_2)$$

$$H_z = (\mathbf{n}\times\mathbf{Q})_z = n_xn_y\,(D_2 - D_1)$$

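From Cartesian components we must pass to spherical components $r$,
$\vartheta$, and $\varphi$ using an easily developed array of cosines:

$$\begin{array}{c|ccc}
 & r & \vartheta & \varphi \\ \hline
x & \sin\vartheta\cos\varphi & \cos\vartheta\cos\varphi & -\sin\varphi \\
y & \sin\vartheta\sin\varphi & \cos\vartheta\sin\varphi & \cos\varphi \\
z & \cos\vartheta & -\sin\vartheta & 0
\end{array}$$

and noting that

$$n_x = \sin\vartheta\cos\varphi, \quad n_y = \sin\vartheta\sin\varphi, \quad n_z = \cos\vartheta$$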

Hence

$$H_r = H_xn_x + H_yn_y + H_zn_z = -n_xn_yn_z(D_1 + 2D_2) + n_xn_yn_z(2D_1 + D_2) + n_xn_yn_z(D_2 - D_1) = 0$$

as it should be in a wave zone. For the other two components of the magnetic
field we have:

$$H_\vartheta = H_x\cos\vartheta\cos\varphi + H_y\cos\vartheta\sin\varphi - H_z\sin\vartheta = (D_1 - D_2)\sin\vartheta\sin\varphi\cos\varphi$$

$$H_\varphi = -H_x\sin\varphi + H_y\cos\varphi = (D_1 + D_2 + D_1\cos^2\varphi + D_2\sin^2\varphi)\sin\vartheta\cos\vartheta$$

For $D_2 = D_1$ we obtain for a single-axis quadrupole $H_\vartheta = 0$, as it
should be according to symmetry considerations. At $\varphi = 0$ and $\varphi = \pi/2$,
that is, on two perpendicular “longitudes”, $H_\vartheta = 0$; on the poles and the
equator $H_\varphi = 0$.
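
The projections in this exercise can be verified numerically. The sketch below is an illustration under stated assumptions: the function name and the sample values of D₁, D₂ and the angles are hypothetical. It builds the vector Q with components n_x D₁, n_y D₂, −n_z (D₁ + D₂), forms the vector product n × Q, and projects it on the spherical unit vectors. It checks that the radial component vanishes, that the ϑ component vanishes for D₁ = D₂ and on the longitudes φ = 0 and φ = π/2, and that the φ component agrees with the closed expression.

```python
import math

def wave_zone_components(d1, d2, theta, phi):
    """Return (H_r, H_theta, H_phi) for H = n x Q in the principal axes of the tensor."""
    n = (math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta))
    q = (n[0] * d1, n[1] * d2, -n[2] * (d1 + d2))
    h = (n[1] * q[2] - n[2] * q[1],
         n[2] * q[0] - n[0] * q[2],
         n[0] * q[1] - n[1] * q[0])
    e_theta = (math.cos(theta) * math.cos(phi),
               math.cos(theta) * math.sin(phi),
               -math.sin(theta))
    e_phi = (-math.sin(phi), math.cos(phi), 0.0)

    def dot(u, w):
        return sum(x * y for x, y in zip(u, w))

    return dot(h, n), dot(h, e_theta), dot(h, e_phi)

d1, d2, theta, phi = 1.3, -0.4, 0.7, 1.1  # arbitrary illustrative values
h_r, h_theta, h_phi = wave_zone_components(d1, d2, theta, phi)
expected_phi = ((d1 + d2 + d1 * math.cos(phi) ** 2 + d2 * math.sin(phi) ** 2)
                * math.sin(theta) * math.cos(theta))
print(h_r, h_theta, h_phi, expected_phi)
```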
THE INADEQUACY OF CLASSICAL MECHANICS.
THE ANALOGY BETWEEN CLASSICAL 
MECHANICS AND GEOMETRICAL OPTICS 

The Instability of the Atom According to Classical Notions.
Rutherford’s experiments established that the atom consists of light
negative electrons and a heavy positive nucleus, very small in size in
comparison with the atom itself. For such a system to be stable the 
electrons must of necessity revolve about the nucleus, just as the 
planets revolve about the sun: opposite charges at rest would be 
immediately drawn to one another. 

But this stability condition is quite insufficient. For one, in a gas 
the atoms collide continuously, whereas in condensed bodies (liquids 
or crystals) they are in constant close contact. It is hard to imagine 
how the atoms of every element retain their identity in such
conditions. If, for instance, the solar system were to collide with another
stellar system, the state of affairs in both systems would change 
after the collision. 

Besides, as pointed out in the preceding section, the electron’s 
motion in the atom would inevitably lead to radiation of
electromagnetic waves, and the dissipation of energy in the process would
just as inevitably cause the electron to fall on the nucleus. This, 
of course, is in striking contradiction with the obvious fact of atomic 
stability. 

Bohr’s Theory. In 1913, Niels Bohr suggested a compromise as
a way out of this difficulty. According to Bohr, an atom has stable
orbits such that an electron moving in them does not radiate
electromagnetic waves. But in making a transition from an orbit of higher
energy to one with lower energy, an electron radiates, and the
frequency of this radiation is related to the difference between the
energies of the electron in these two orbits by the equation

$$h\omega = E_1 - E_2$$

where $h$ is a universal constant equal to $1.054 \times 10^{-27}$ erg·s.

Both of Bohr’s principles were in the nature of postulates. But 
it was possible with their aid to explain, in excellent agreement 
with experiment, the observed spectrum of the hydrogen atom and 
also the spectra of a series of atoms and ions similar to the hydrogen 
atom (for example, the positive helium ion, which consists of a 
nucleus and one electron). Despite the fact that both of these quantum 
postulates of Bohr were completely alien to classical physics and 
could in no way be explained on the basis of classical concepts, 
they represented an extraordinary step forward in the theory of 
the atom. 

The first postulate states that not every state of the atom is stable, 
but only certain states. This, as we now know, is in agreement with 
observations and derives from quantum mechanics just as directly 
as elliptical planetary orbits derive from Newtonian mechanics. 

Bohr’s theory was very successful in explaining the spectra of 
hydrogen and similar atomic systems. But the very next step, a two- 
electron atom, such as the helium atom, did not yield to consistent 
calculation by Bohr’s theory. The theory was even less capable of 
explaining the stability of the hydrogen molecule. For this reason, 
the situation in physics, notwithstanding a number of brilliant 
results of the Bohr theory, was completely unsatisfactory. Besides 
the particular difficulties that we have noted here, the Bohr theory 
was, on the whole, eclectic, since it was inconsistent in its
combination of classical and quantum concepts.

Light Quanta. The inadequacy of classical physics for an
understanding of physical facts was apparent in very many other cases besides
the issue of the stability of the atom. As far back as in 1900, Max 
Planck demonstrated that the state of thermal equilibrium between 
field and matter could not be satisfactorily described with the help 
of the classical laws of radiation, which treat it as a continuous 
process. In fact, Planck was driven to the assumption that radiators 
transmit energy to electromagnetic fields in discrete portions, or 
quanta. Each quantum possesses energy in proportion to its
frequency, the proportionality factor being the constant $h$ mentioned before.
In fact, Bohr made use of Planck’s assumption in formulating his 
second postulate. 

The hypothesis concerning light quanta proved extremely useful
in explaining a wide range of phenomena. Of great importance was
the quantum explanation of the photoelectric effect proposed by
Einstein. When the surface of a metal bordering on vacuum is
illuminated, electrons are ejected from the metal. The energy of each
separate electron is found to be quite independent of the total energy
of the incident radiation; it depends only on its frequency. The
energy of an ejected electron is represented as the difference between
two quantities: the energy quantum, $h\omega$, and the work required
to remove the electron from the metal.

Expanding on Planck’s ideas, Einstein assumed that
electromagnetic radiation is not only emitted and absorbed in batches, but
also propagates as discrete quanta. 

Since the energy of a quantum is equal to $h\omega$, and its speed is $c$,
its momentum should be $h\omega/c$ (see Secs. 14 and 18). Hence, a
quantum is a particle of zero mass, the possibility of the existence of
which follows from relativistic mechanics.

The momentum of quanta was discovered in the Compton effect. 
In terms of the classical theory, the scattering of electromagnetic 
waves by free electrons should appear as follows: the incident wave 
makes an electron oscillate, which causes it to become a radiator 
in its own right (see Exercise 3, Section 20). But then the frequency 
of the radiation scattered by the electron must coincide with the 
frequency of the incident electromagnetic waves. 

For an electron within matter to be treated as free the frequency 
of the incident radiation must be very high in comparison with 
the electron’s natural frequency. Therefore scattering of short-wave 
radiation (X rays) should be observed. In that case the natural 
frequency of the electron’s oscillations in the optical band cannot
appreciably displace the frequency of the scattered light (that, at 
least, is what should be expected in accordance with the classical 
theory). 

Reasoning in terms of the principles of classical mechanics, we 
can assume that the relative shift in the frequency of X rays in 
scattering on the atomic electrons of matter is the less the higher 
the oscillation frequency, or the shorter the wavelength. For a given 
wavelength, scattering at any angle should not result in a change 
in frequency, since the scattered radiation is emitted by the same
oscillating charge in step with the oscillations of the electromagnetic 
field of the incident wave. 

Compton’s experiments (1923) revealed an entirely different 
picture. The harder the incident radiation (that is, the shorter the 
wavelength) the greater was the observed reduction in the frequency 
of the scattered radiation (for a given scattering angle). For a specific 
frequency of the incident radiation, the frequency of the scattered 
radiation was found to be the lower the greater the scattering angle. 
A screen through which incident X rays passed freely almost
completely blocked rays scattered at a sufficiently large angle.

These results cannot be explained by the classical theory—but 
they are beautifully explained in terms of quantum notions. The 
scattering of X rays on a free electron should be treated as a collision
of two particles. One of them—the electron—should be assumed 
at rest prior to the collision, the other—the light quantum—should 
be treated as a conventional mass point possessing energy and
momentum. The special properties of the quantum are taken into
account only by putting its momentum equal to its energy divided 
by c. 

The energy and momentum conservation laws hold in collisions 
involving quanta in exactly the same way as they do in collisions 
of any other particles (see Sec. 6). Since in a collision with an electron 
a quantum imparts a certain momentum to it, its own momentum, 
and hence its energy, decreases. As a consequence, the frequency 
of the scattered radiation is lower than the frequency of the incident 
radiation. 

As in any problem on particle scattering, it is sufficient to state
one quantity in order to determine all the others. Usually the
deflection angle of the initial incident particle is given. We examined
a similar problem in Exercise 2, Section 14. Using the result of that
problem, and taking into account that the energy of the incident
quantum is $E_0 = h\omega_0$, the energy of the scattered quantum is
$E = h\omega$, and the initial energy of the electron is $mc^2$, we obtain

$$\omega = \frac{\omega_0}{1 + h\omega_0(1 - \cos\theta)/(mc^2)}$$

This gives the dependence of the frequency of the scattered
radiation on the scattering angle $\theta$. It agrees fully with experimental
data.
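
The angular dependence is easy to tabulate. The sketch below evaluates the scattered frequency ω = ω₀ / [1 + hω₀(1 − cos θ)/(mc²)] for a few angles and converts the result at 90° into a wavelength shift. The constant h is the value quoted in the text; the electron mass, the speed of light, the incident wavelength and all function names are assumptions made for the illustration.

```python
import math

H = 1.054e-27    # erg*s, the value of the constant h quoted in the text
M_E = 9.109e-28  # electron mass in grams (assumed value)
C = 2.998e10     # speed of light in cm/s (assumed value)

def scattered_frequency(omega0, theta):
    """Frequency of the quantum scattered through the angle theta (radians)."""
    return omega0 / (1.0 + H * omega0 * (1.0 - math.cos(theta)) / (M_E * C ** 2))

def wavelength(omega):
    """Wavelength corresponding to the angular frequency omega."""
    return 2.0 * math.pi * C / omega

omega0 = 2.0 * math.pi * C / 0.71e-8  # incident X rays, wavelength 0.71e-8 cm (illustrative)
for degrees in (0, 45, 90, 135, 180):
    omega = scattered_frequency(omega0, math.radians(degrees))
    print(degrees, omega / omega0)

shift_90 = wavelength(scattered_frequency(omega0, math.pi / 2)) - wavelength(omega0)
print("wavelength shift at 90 degrees:", shift_90, "cm")
```

In terms of wavelength the shift at a given angle does not depend on the incident frequency, so the relative change is larger for harder radiation, in line with the observations described above.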

The notion of light as a stream of corpuscles was not new in 
physics, but by the time the hypothesis of light quanta was
enunciated the wave theory of light had been generally recognized, while
the corpuscular theory seemed abandoned forever. Such phenomena 
as diffraction and interference of light can, in the classical theory, 
be interpreted only on the basis of the wave picture and utterly 
contradict the corpuscular one. On the other hand, the photoelectric 
and Compton effects are as hopelessly at variance with the classical 
wave theory. 

Thus, by the beginning of the 1920s the science of physics found 
itself in the unusual position of having to rely on two apparently 
fundamentally contradictory theories to explain various phenomena 
involving the same essence, the electromagnetic field. The way out 
of the situation was provided by consistent quantum theory, in 
which every type of motion possesses certain wave properties. 

Classical Newtonian mechanics, it was found, is restricted to 
macroscopic bodies, often breaks down when applied to microscopic 
entities, and is totally unsuitable in describing the motions of 
atomic electrons or light quanta. Quantum mechanics embodies
the exact criterion of applicability of classical notions. 

But before taking up quantum mechanics we must explain how 
one and the same entity can, as a matter of principle, display both 
corpuscular and wave properties. 

The Correspondence Between Geometrical Optics and Classical
Mechanics. The corpuscular theory of light may have been due to
the fact that in a homogeneous medium light rays propagate in
a straight line, like particles not subject to the action of the
medium. But plane electromagnetic waves propagate in an infinite
homogeneous medium just as rectilinearly: the normal to the wave
front is constant in direction. That being the case, there exists
a reciprocal correspondence between the corpuscular and wave
pictures. The difference between them appears when waves propagate
in a restricted space. The one-to-one correspondence between a wave
front and the direction of the wave normal is violated by the
relationships (19.10). Whereas in the case of a perfectly plane wave at
every point in space one definite straight line (the wave normal)
can be constructed and it can be compared with the path of a
corpuscle, in the case of a “smeared” wave front there is a cone of
directions at every given point and no physical means of stating the
corpuscle’s “actual” path. All directions within the cone are equally
valid, but that means that, strictly speaking, none of them is
valid.

Constructions with light beams are used in geometrical optics. 
But such constructions, we have seen, are the more ambiguous the 
less the precision with which the direction of the wave normal can 
be defined. And the latter depends on the dimensions of the domain 
in which the electromagnetic wave propagates. As long as the
domain is in all dimensions large in comparison with the wavelength
of the light, no diffraction (wave) effects come into play. But as 
the dimensions of the domain approach the wavelength of light 
the concept of a light ray becomes increasingly meaningless. 

Thus, on the basis of purely wave notions it is possible to establish 
a criterion for the applicability of the ray concept (that is, the
corpuscular picture) and establish the rules for the limiting transition
from wave to mechanical concepts. This, in turn, indicates the way 
to pass from the corpuscular concepts of Newtonian mechanics back, 
so to say, to the wave concepts of quantum mechanics. 

But first it is necessary to establish the correspondence between 
mechanical and wave quantities. 

Surfaces of Constant Phase. First, let us determine the significance
of the wave phase in light-ray optics. The very notion of phase
would appear to be entirely alien to geometrical optics, but in fact,
as will now be shown, it is possible to give a definite mechanical
meaning to the concept of phase.

We shall begin with the expression for the field of a propagating
electromagnetic wave in the form

$$\mathbf{E} = \mathbf{E}_0(\mathbf{r}, t)\cos\frac{\chi(\mathbf{r}, t)}{\lambda} \tag{21.1}$$


Here, $\lambda$ is the wavelength, regarded as small in comparison with
the linear dimensions of the domain occupied by the field. In the
limiting case of a plane wave the phase is

$$\frac{\chi}{\lambda} = \mathbf{k}\cdot\mathbf{r} - \omega t \tag{21.2}$$

(cf. (18.25) and (18.26)). But since $\mathbf{k} = 2\pi\mathbf{n}/\lambda$ and $\omega = 2\pi u/\lambda$,
where $u$ is the phase velocity (see (19.7)), it is convenient to exhibit
the $\lambda$ dependence explicitly by writing the phase as the quotient
$\varphi = \chi/\lambda$.

The expression for the field in terms of phase must be substituted 
into the wave equation in order to specify the order of magnitude 
of all the terms that should be discarded for the transformation 
to the approximation of ray, or geometrical, optics: 

$$\nabla^2\mathbf{E} - \frac{1}{u^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = 0 \tag{21.3}$$


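In differentiating with respect to $t$ and $\mathbf{r}$, we retain only those
terms that contain the highest degree of $\lambda$ in the denominator, because
$\lambda$ is by definition a small quantity. That is why it is unnecessary
to differentiate the amplitude of the wave packet, $\mathbf{E}_0(\mathbf{r}, t)$.
Consequently

$$\frac{\partial\mathbf{E}}{\partial t} = -\frac{\mathbf{E}_0}{\lambda}\frac{\partial\chi}{\partial t}\sin\frac{\chi}{\lambda}$$

$$\frac{\partial^2\mathbf{E}}{\partial t^2} = -\frac{\mathbf{E}_0}{\lambda}\frac{\partial^2\chi}{\partial t^2}\sin\frac{\chi}{\lambda} - \frac{\mathbf{E}_0}{\lambda^2}\left(\frac{\partial\chi}{\partial t}\right)^2\cos\frac{\chi}{\lambda}$$

The first term in the second line should be discarded in comparison
with the second term, containing $\lambda^2$ in the denominator. Hence

$$\frac{\partial^2\mathbf{E}}{\partial t^2} \approx -\mathbf{E}_0\left(\frac{1}{\lambda}\frac{\partial\chi}{\partial t}\right)^2\cos\frac{\chi}{\lambda}$$

and similarly

$$\nabla^2\mathbf{E} \approx -\mathbf{E}_0\left(\frac{1}{\lambda}\operatorname{grad}\chi\right)^2\cos\frac{\chi}{\lambda}$$

Substituting these expressions into (21.3), we obtain a first-order
differential equation for the phase $\varphi = \chi/\lambda$:

$$(\operatorname{grad}\varphi)^2 - \frac{1}{u^2}\left(\frac{\partial\varphi}{\partial t}\right)^2 = 0 \tag{21.4}$$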
In the limiting case of a plane wave, it follows from (21.2) that

$$\mathbf{k} = \operatorname{grad}\varphi = \frac{\partial\varphi}{\partial\mathbf{r}} \tag{21.5}$$

and

$$\omega = -\frac{\partial\varphi}{\partial t} \tag{21.6}$$

where for a plane wave

$$k^2 - \frac{\omega^2}{u^2} = 0$$


But according to (21.4) this equation is also satisfied by the
quantities $\operatorname{grad}\varphi$ and $-\partial\varphi/\partial t$ in an almost plane wave (21.1).
Consequently, Eqs. (21.5) and (21.6) can be taken as definitions of the
wave vector and frequency of an almost plane and almost
monochromatic wave. (We assumed the duration of the wave process
represented by Eq. (21.1) to be much greater than the oscillation
period $2\pi/\omega$.)

It is apparent from (21.5) that the wave vector is directed along
the normal to the surface of constant phase $\varphi = \varphi_0$, that is, it defines
the direction of the light ray at the given point in space. The
propagation of an almost plane wave is represented as a spatial
displacement of a family of surfaces of constant phase.

At different instants of time $t$ a surface of some specific value
$\varphi = \varphi_0$ occupies different positions in space according to the
equation

$$\varphi(\mathbf{r}, t) = \varphi_0$$


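Let us determine the speed with which this surface propagates.
For that we proceed from the condition

$$d\varphi = \frac{\partial\varphi}{\partial t}\,dt + \frac{\partial\varphi}{\partial\mathbf{r}}\cdot d\mathbf{r} = 0$$

Let $d\mathbf{r}$ be a vector directed normal to the surface. Then $|\partial\varphi/\partial\mathbf{r}|$
is the absolute value of $\mathbf{k}$. In accordance with the definition of phase
velocity (19.7), we obtain from (21.5) and (21.6)

$$u = \left|\frac{d\mathbf{r}}{dt}\right| = \frac{|\partial\varphi/\partial t|}{|\partial\varphi/\partial\mathbf{r}|} = \frac{\omega}{|\mathbf{k}|}$$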

The group velocity of propagation of the wave packet (21.1) can be
determined with the help of (19.5) as

$$\mathbf{v} = \frac{\partial\omega}{\partial\mathbf{k}} \tag{21.7}$$


It is essential that for an almost plane wave co can be expressed 
as a function of k, as is done for a plane wave. 
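
The difference between the two velocities can be illustrated numerically. The sketch below is only an example under stated assumptions: the dispersion law ω(k) = (ω_p² + c²k²)^{1/2}, the constants and the function names are hypothetical choices made here, not taken from the text. The phase velocity is computed as ω/k and the group velocity as the derivative dω/dk, evaluated by a central difference.

```python
import math

C = 1.0        # limiting speed in arbitrary units (assumed)
OMEGA_P = 1.0  # hypothetical cutoff frequency of the illustrative dispersion law

def omega(k):
    """Illustrative dispersion law omega(k), chosen for this example only."""
    return math.sqrt(OMEGA_P ** 2 + (C * k) ** 2)

def phase_velocity(k):
    """Propagation speed of a surface of constant phase, omega / k."""
    return omega(k) / k

def group_velocity(k, dk=1e-5):
    """Speed of the wave packet, d(omega)/dk, by a central difference."""
    return (omega(k + dk) - omega(k - dk)) / (2.0 * dk)

k = 2.0
u = phase_velocity(k)
v = group_velocity(k)
print("phase velocity:", u)
print("group velocity:", v)
print("product u*v:   ", u * v)
```

For this particular law the product of the two velocities equals C², so the phase velocity exceeds C while the packet itself travels more slowly than C.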


Similar Quantities in the Optico-Mechanical Analogy. In
Section 10 we examined the propagation of surfaces of constant action
of a system of identical mass points travelling along a bundle of
paths. At the initial time the initial conditions for each such particle
are stated. As the particles move the value of the action varies for
each of them according to the equation

$$S = \int_{t_0}^{t} L\,dt$$

It was established before that the propagation of surfaces $S$ =
constant is described by a first-order partial differential equation in
the form (10.20), in which we must substitute V = S if the action 
of the particles itself is taken as the function responsible for the 
canonical transformation. 

The Hamilton-Jacobi equation is fully analogous to the equation 
of the propagation of a constant-phase surface (21.4). The latter can, 
for example, be written as follows: 

$$\left[u^2(\operatorname{grad}\varphi)^2\right]^{1/2} = -\frac{\partial\varphi}{\partial t}$$


Then the left-hand side of the equation is similar to the
Hamiltonian, in which the wave vector $\mathbf{k} = \operatorname{grad}\varphi$ has been substituted
for the momentum and the coordinate dependence is involved by
means of the quantity $u$. Equation (21.4) is even more like the
Hamilton-Jacobi equation in relativistic form if we substitute $m = 0$
into the Hamiltonian of a free particle (14.12) and replace $\mathscr{H}$
by $-\partial S/\partial t$, and $\mathbf{p}$ by $\operatorname{grad} S$.

Thus, the propagation of light rays in a medium is fully analogous 
to the motion of particles of zero mass. The mechanics of these 
particles is defined by Eq. (21.4) to exactly the same degree as the 
mechanics of “conventional” mass points is by the Hamilton-Jacobi 
equation. 

To every quantity in mechanics there corresponds a similar
quantity in geometrical optics. The similarity of the quantities can be
established on the basis of a comparison between phase and action.
It will be observed that frequency corresponds to energy, and the
wave vector to momentum. Indeed, the correspondence between $E$
and $\omega$ can be seen from (10.26) and (21.6):

$$E = -\frac{\partial S}{\partial t}, \qquad \omega = -\frac{\partial\varphi}{\partial t}$$

while from (10.24) and (21.5) we see the correspondence between $\mathbf{k}$
and $\mathbf{p}$:

$$\mathbf{p} = \frac{\partial S}{\partial\mathbf{r}}, \qquad \mathbf{k} = \frac{\partial\varphi}{\partial\mathbf{r}}$$
But then the propagation velocity of a constant-action surface,
according to (10.27), and that of a surface of constant phase have
quite similar expressions:

$$\frac{E}{|\mathbf{p}|} \quad \text{and} \quad \frac{\omega}{|\mathbf{k}|}$$

Finally, the velocity of a wave packet is analogous to the velocity
of the particles:

$$\mathbf{v}_{\text{particle}} = \frac{\partial E}{\partial\mathbf{p}}, \qquad \mathbf{v}_{\text{packet}} = \frac{\partial\omega}{\partial\mathbf{k}}$$

In Section 18 we found the law for transforming the frequency 
and wave vector, which coincides with the law for transforming 
the energy and momentum of a particle in passing from one reference 
frame to another. 

Corresponding optical and mechanical quantities differ only in
dimension. Thus, phase has zero dimension, while the dimension
of action is that of $\int L\,dt$, that is g·cm²·s⁻¹. The wave vector and
momentum, frequency and energy also differ correspondingly by
the dimension of action. The coefficient of proportionality must be
the same in all relationships, because otherwise the
optico-mechanical analogy would not be relativistically invariant. This
coefficient, as we shall see later, is again the action quantum, or
Planck’s constant $h$.

In the following section it will be shown that the
optico-mechanical analogy is a consequence of a limiting transition from the
wave equations of quantum mechanics to the equations of classical
mechanics, similar to the limiting transition from the wave equation
of electrodynamics to the equation of propagation of light rays.


EXERCISE 


Proceeding from the fact that phase is analogous to action, show that 
light of given frequency is propagated along paths for which the propagation 
time of constant phase is least (the Fermat principle). 

Solution. At constant frequency

$$\varphi = \int_1^2 \mathbf{k}\cdot d\mathbf{r} = \omega\int_1^2 \frac{(\mathbf{n}\cdot d\mathbf{r})}{u}$$

The product $(\mathbf{n}\cdot d\mathbf{r})$ is the displacement of the surface in a direction normal
to it. It follows that $u^{-1}(\mathbf{n}\cdot d\mathbf{r})$ is the propagation time $dt$. In accordance
with the variational principle, which governs phase as well as the analogous
quantity of action, the time $t = \int_1^2 dt$ is the least possible time.
In mechanics a similar principle holds when the energy of the particles
is constant. Then the action should be written in the form

$$S = \int_1^2 \mathbf{p}\cdot d\mathbf{r}$$

But the momentum $\mathbf{p}$ is usually associated with the velocity of the particle
itself ($\mathbf{p} = m\mathbf{v}$) and not with the constant-action surface (the Euler-
Maupertuis principle of least action).
ELECTRON DIFFRACTION

The Essence of Diffraction Phenomena. Classical mechanics is
analogous only to geometrical optics, and by no means to wave optics.
The difference between mechanics and wave optics is best illustrated 
by the example of diffraction phenomena. 

Let us consider the following experiment. Let there be a screen 
with two small apertures. Let us assume that the distance between 
the apertures is of the same order of magnitude as the apertures 
themselves. We cover one of the apertures (which we call the “first”) 
and direct a light wave on the screen. We observe the wave passing 
through the second aperture by the intensity distribution on a second 
screen situated behind the first. Then we cover the second aperture. 
The intensity distribution changes. Now we open both apertures 
together. An intensity distribution is obtained which in no way 
represents the sum of the intensities due to each aperture separately. 
At the points of the screen at which the waves from both apertures 
arrive in opposite phase they mutually cancel, while at those points 
at which the phase for both apertures is the same, they reinforce 
each other. In other words, it is not the intensities of light, that 
is the quadratic values, that are added, but the values of the fields 
themselves. 

This type of diffraction can occur only because the wave passes 
through both apertures. Only then are definite phase differences 
obtained at points of the second screen for rays passing through each 
aperture. (We disregard here the diffraction effects associated with 
the passage through one aperture. These phenomena are due to the 
phase differences of the rays passing through various points of the 
aperture. Instead of examining such phase differences we assume 
that the phase of the wave passing through each aperture is constant, 
but we take into account the phase differences between waves passing 
through different apertures. Nothing is essentially changed by this
simplification.) 

A somewhat more complicated picture is that of the diffraction 
of X rays in a crystal lattice, because the diffraction phenomena 
take place in three dimensions. It is best to imagine the lattice as 
a stack of planes filled with individual atoms or molecules. A wave 
passing through the lattice is partially reflected from each such 
plane at the same angle. But since the reflections occur at constant 
distances from each other, a constant phase difference appears between 
the reflected waves, which depends on the angle of incidence 
of the wave on the crystal. The reflected waves may be mutually 
reinforced only if the phase difference is, depending on the angle of 
incidence, equal to $2\pi$, $4\pi$, $6\pi$ and so on. 

The distance between the planes in crystals of simple structure 
can be calculated without resorting to X-ray diffraction (from the 
density, atomic weight, and the Avogadro number), knowing the 
number of atoms per cubic centimetre. The angle at which reflection 
from a given crystallographic plane is possible is related to the 
distance between the planes by a simple geometric condition involving 
the wavelength (the Bragg law) so that the wavelength of X rays 
can be determined by measuring the reflection angle. 
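
As a rough illustration of both steps, the sketch below first estimates the plane spacing of rock salt from its density and molar mass, and then applies the Bragg law in its usual form, $n\lambda = 2d\sin\theta$ with $\theta$ the glancing angle. The choice of rock salt, the rounded handbook values and the reflection angle are assumptions made for the example.

```python
import math

AVOGADRO = 6.022e23  # mol^-1 (modern rounded value; the text quotes 6.024e23)

def cubic_plane_spacing(molar_mass, density, ions_per_formula=2):
    """Spacing (cm) between neighbouring planes of a simple cubic arrangement
    in which every ion occupies a cube of edge d."""
    ions_per_cm3 = ions_per_formula * density * AVOGADRO / molar_mass
    return ions_per_cm3 ** (-1.0 / 3.0)

def bragg_wavelength(spacing, glancing_angle_deg, order=1):
    """Wavelength from the Bragg law: order * wavelength = 2 d sin(theta)."""
    theta = math.radians(glancing_angle_deg)
    return 2 * spacing * math.sin(theta) / order

# Rock salt: molar mass about 58.44 g/mol, density about 2.165 g/cm^3
d_nacl = cubic_plane_spacing(molar_mass=58.44, density=2.165)

# Hypothetical measured glancing angle of a first-order reflection
wavelength = bragg_wavelength(d_nacl, glancing_angle_deg=15.0)

if __name__ == "__main__":
    print("plane spacing, cm:", d_nacl)
    print("wavelength, cm:", wavelength)
```

The spacing comes out close to $2.8 \times 10^{-8}$ cm, so reflection angles of a few tens of degrees correspond to wavelengths of the order of $10^{-8}$ cm.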

Although the three-dimensional diffraction that occurs in a crystal 
lattice is more complex than the plane picture of diffraction at apertures, 
the cause of these phenomena is the same: just as a wave must 
pass through both apertures simultaneously for a phase difference 
to appear on the screen, the scattering of X rays must occur on all 
atoms of the crystal lattice. The same wave must be scattered on 
every atom of the crystal. Then the reflection condition is satisfied 
only at strictly definite values of the angles. No corpuscular picture 
could explain X-ray diffraction without taking into account the 
wave properties of radiation. 

Electron Diffraction. The picture is exactly the same in the scattering 
of electrons (as well as of other microparticles) on crystals. 
Electrons, as we know, act on a photographic plate or luminescent 
screen in a way similar to X rays, and direct experiments accordingly 
reveal that microparticles undergo diffraction governed by the same 
basic laws as electromagnetic wave diffraction. 

But for that each electron must be scattered on all the atoms of 
the lattice, since electrons travel entirely independently of one 
another: there can be no constant phase difference between them. 
They may simply pass through a crystal one by one, and the diffraction 
picture will be the same as in the passage of all at once. 

In optics, it will be recalled, a diffraction pattern can be obtained 
when only one light source is used—beams from different sources 
are not coherent. It is necessary for the same light wave to pass 
through both apertures: only then can the path difference maintain 
its constant value, which depends on the geometry of the instruments 
in the experiment as well as on the wavelength. In this case 
light and dark spots persist at the same points of the screen despite 
the fact that the rays emitted by the source at different times are 
not coherent. But this is of no consequence in observing diffraction: 
whenever the wave was emitted, on passing through both apertures 
it reconverges, reinforcing or damping itself on the screen, depending 
on the number of half-waves that fits into each path from the apertures 
to the respective points on the screen. 

Electron diffraction proves that in the microworld the laws of 
motion are in general of a wave nature, and that each electron is 
scattered on all the atoms of a crystal lattice. This is obviously 
incompatible with the concept of a definite path of an electron, 
just as X-ray diffraction is incompatible with ray, or geometrical, 
optics. 

Diffraction phenomena are proof that electron motion is associated 
with a phase of certain magnitude. 


Electron Wavelength. Just as the wavelength of X rays is determined 
from their diffraction, so the diffraction of electrons 
(or other microparticles) makes it possible to measure the wavelength 
associated with them. There exists a very simple relationship between 
the wavelength and the velocity or the momentum $p$ of a particle. 
The wave vector of a particle is related to its momentum 
by 

$$\mathbf{k} = \mathbf{p}/h \qquad (22.1)$$


This relationship was enunciated by Louis de Broglie several 
years before it was experimentally confirmed by C. Davisson and 
L. Germer, who first observed the diffraction of electrons on crystals. 

The constant $h$, or action quantum, was mentioned in the preceding 
section. Formerly, a constant $2\pi$ times greater than $h$ was 
commonly used, and the value of $h$ used in this book was denoted $\hbar$. 
Also, the number of oscillations per second, $\nu$, was used in the expression 
for phase instead of the number of radians per second, $\omega$. 

The wavelength of an electron corresponding to (22.1) is 

$$\lambda = \frac{2\pi}{k} = \frac{2\pi h}{p} = \frac{2\pi h}{mv} \qquad (22.2a)$$

and it is called the de Broglie wavelength. A quantity smaller by 
a factor of $2\pi$, 

$$\bar{\lambda} = \frac{1}{k} = \frac{h}{p} \qquad (22.2b)$$

is often used. 

Equation (22.2a) shows that the de Broglie wavelength corresponding 
to the motion of macroscopic bodies is extremely small, 
owing to the smallness of h with respect to all the quantities that 
could characterize the motion of such bodies. If the de Broglie 
wavelength is expressed in terms of the natural units of macroscopic 
motion, centimetre-gram-second, for a body of mass, say, one gram 
and velocity 10 B cm·s$^{-1}$, it yields $\lambda \approx 10^{-22}$ cm. Obviously, it is 
impossible to observe the motion of a macroscopic body in a domain 
of such small dimensions, hence no diffraction phenomena can be 
observed. The applicability of classical (Newtonian or relativistic) 
mechanics to macroscopic bodies is in practice unrestricted. 


The Limits of Applicability of Classical Concepts. The relationships 
between quantities are entirely different when equation (22.2a) 
is applied to the motion of an electron in an atom. As mentioned 
before, the dimensions of atoms are easily determined by dividing 
the volume of one gram atom of a solid or liquid substance by the 
Avogadro number $N_A = 6.024 \times 10^{23}$. The atomic radius is of 
the order of $0.5 \times 10^{-8}$ cm. 

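The order of magnitude of the velocity of an electron is evaluated 
as follows. Write the equation of motion of a charge in a Coulomb 
field 

$$m\ddot{\mathbf{r}} = -\frac{Ze^2\,\mathbf{r}}{r^3}$$

Multiply both sides scalarly by $\mathbf{r}$ and transform the second derivative 
with respect to time in the left-hand side thus: 

$$m\,\mathbf{r}\cdot\ddot{\mathbf{r}} = m\frac{d}{dt}(\mathbf{r}\cdot\dot{\mathbf{r}}) - m\dot{\mathbf{r}}^2 = -\frac{Ze^2 r^2}{r^3} = -\frac{Ze^2}{r} = U$$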

We obtain the expression for potential energy. Now average the 
equation over a certain sufficiently large time interval. Then the 
mean value of the total derivative is in any case zero: as we know 
from experience in normal conditions the atom is stable. But it 
follows from this that the mean value of its kinetic energy is equal 
to one-half the mean potential energy taken with the opposite sign. 
Substituting the obtained evaluation of the radius, $r = 0.5 \times 10^{-8}$ cm, 
$Z = 1$ (a hydrogen atom), and $m = 9 \times 10^{-28}$ g (the 
electron mass), we obtain 

$$v = \left(\frac{Ze^2}{mr}\right)^{1/2} \approx 2.2 \times 10^{8}\ \mathrm{cm\cdot s^{-1}}$$


Thus, the wavelength (divided by $2\pi$) corresponding to the motion of an 
electron, as defined by Eq. (22.2b), is equal to $0.5 \times 10^{-8}$ cm, that is, it is 
a quantity of the dimensions taken for the atom itself. 
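
The estimate can be reproduced in a few lines. This is a minimal sketch in Gaussian (CGS) units; the rounded constants and the function names are assumptions of the example, and `H_ACTION` stands for the action quantum in the convention of this book.

```python
import math

# Rounded constants in Gaussian (CGS) units
E_CHARGE = 4.803e-10    # electron charge, esu
E_MASS = 9.109e-28      # electron mass, g
H_ACTION = 1.0546e-27   # action quantum h as used in this book, erg*s

def orbital_velocity(radius, z=1):
    """Velocity from the averaged relation m v^2 = Z e^2 / r."""
    return math.sqrt(z * E_CHARGE ** 2 / (E_MASS * radius))

def de_broglie(mass, velocity):
    """Return (wavelength 2*pi*h/(m v), wavelength divided by 2*pi)."""
    reduced = H_ACTION / (mass * velocity)
    return 2 * math.pi * reduced, reduced

radius = 0.5e-8                      # atomic radius taken in the text, cm
v = orbital_velocity(radius)
wavelength, reduced_wavelength = de_broglie(E_MASS, v)

if __name__ == "__main__":
    print("velocity, cm/s:", v)
    print("wavelength, cm:", wavelength)
    print("wavelength / 2 pi, cm:", reduced_wavelength)
    print("orbit circumference, cm:", 2 * math.pi * radius)
```

The wavelength divided by $2\pi$ comes out close to the assumed radius, so the full wavelength is close to the circumference $2\pi r$ of an orbit of that radius.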

If we assume that there is some electron path in an atom, then 
its total length is that of only one de Broglie wavelength. In that 
case it is obviously meaningless to speak of a path at all: it must 
be totally smeared by diffraction phenomena. The picture is as if 
we were considering a wave packet one wavelength in size. The 
amplitude of the wave would be other than zero at all points of 
the packet, so that no curve drawn within the domain of the packet 
could be correlated with an imagined path of a particle or a light 
beam. The motion of a particle within such a packet 
can be described only in the framework of wave concepts. (This, 
of course, does not refer to the motion of the packet through space 
as a whole!) 

We thus see that the motion of an electron in an atom is essentially 
a wave phenomenon. The concept of path loses meaning. 

At the same time, it should be remembered that quantum description 
does not deny the electron its properties as a particle, and 
the electron does not become a wave in the conventional sense of the word. 
For example, a part of an electron is never observed. If the second 
screen on which the diffraction picture appears is replaced by a photographic 
plate, it will display but one blackened point for every 
impinging electron. Only the configuration of the blackened points 
as a whole characterizes the diffraction pattern. It is thus more correct 
to say not that the electron becomes a wave but that in the 
microworld the laws of motion are of wave nature. 

At the same time, diffraction would have been quite impossible 
if all atoms of a crystal were not actually involved in the passage 
of the same electron. An electron path of the form we are used to 
with conventional macroparticles simply does not exist in a diffraction 
experiment. What specifically is of a wave nature in this case 
will be shown later. Whatever the case may be, in no experiment 
displaying the wave properties of motion is any splitting of electron 
charge or mass ever observed. 

This does not mean that the electron does, after all, possess some 
definite path which we are simply incapable of observing, owing 
to the inadequacy of our experimental equipment or physical knowledge. 
Diffraction phenomena are specific proof that the electron 
has no path, just as diffracting light does not propagate in separate 
beams. In the diffraction experiment light passes through both 
apertures, which is incompatible with the concept of a single beam. 
In the same sense there is no such thing as an electron’s path in an 
atom: this is a firmly ascertained fact which cannot be refuted by 
any subsequent development of physics. 

Statistical Regularity and an Isolated Experiment. The absence of 
a path certainly does not mean that there are no regularities in 
electron motion. Quite the contrary: identical diffraction experiments, 
involving, of course, a sufficiently large number of electrons 
having the same velocities, always yield identical diffraction patterns. 
There is thus no doubt that causal regularity exists, but it is 
of a statistical nature, appearing in a very large number of separate 
experiments, since each passage of an electron through a crystal 
can be regarded as a separate, independent experiment. 

Diffraction phenomena produce a regular pattern of dots on a 
photographic plate in the same way as a large number of shots at 
a target is subject to a scattering law. However, unlike bullets, 
which travel along paths and therefore produce a smooth distribution 
curve of target hits, electrons produce a more erratic pattern 
of blackened spots, which characterizes wave motion. The scattering 
of bullet hits is due to the impossibility of exactly reproducing 
the initial firing conditions and can be reduced by better aiming; 
electron scattering, on the other hand, produces a regular diffraction 
pattern which can in no way be changed for a given velocity of the 
electrons. 

It should also be noted that statistical regularities in diffraction 
experiments have nothing in common with the statistical regularities 
that govern the motions of large assemblies of interacting particles. 
As repeatedly stressed, the same picture is observed quite irrespective 
of how the electrons pass through a crystal: all at once or one by 
one. A certain phase governing the motion exists only because each 
electron interferes with itself. Of course, quantum laws of motion 
also influence the behaviour of large assemblies of particles, affecting 
the statistical laws inherent in assemblies, but unlike the classical 
laws of motion, they do not lose their probability nature in going 
over to individual electrons. 

The Uncertainty Principle. It remains now to examine in greater 
detail the question: In what cases does the concept of path of a microparticle 
remain meaningful? The paths of particles in the Wilson 
cloud chamber, the cathode-ray oscillograph and many other instruments 
can be excellently precalculated according to the laws of 
classical mechanics, and visually observed. A moving electron in 
a Wilson cloud chamber actually leaves a very real cloud track. 

Let us first recall that under certain conditions light propagates 
along definite paths, rays or beams. Geometrical optics holds when 
the inaccuracy in stating the wave vector, $\Delta k_x$, which is subject 
to the inequality 

$$\Delta k_x\,\Delta x > 2\pi \qquad (22.3)$$

is small in comparison with $k_x$ (see (19.9)). Substituting $\Delta k_x$ from 
Eq. (22.1), we obtain a similar inequality for a microparticle, an 
electron, for example: 

$$\Delta p_x\,\Delta x > 2\pi h \qquad (22.4a)$$

This inequality is known as the uncertainty relation of quantum 
mechanics. 

If we take the evaluation (19.66) instead of (22.4a) for $(\Delta k_x)(\Delta x)$, 
a similar inequality is obtained: 

$$(\Delta p_x)(\Delta x) > h \qquad (22.4b)$$

The concept of an electron path has meaning if the uncertainty 
of all three of its momentum components is small in comparison 
with the momentum itself: 

$$\Delta p_x \ll p_x, \qquad \Delta p_y \ll p_y, \qquad \Delta p_z \ll p_z \qquad (22.5)$$

Note that although we have been using “electron” all along, simply 
to be specific, any microparticle is implied. 

It is easy to see why the uncertainty relations in no way prevent 
electrons from having paths, for example, in a TV kinescope. For 
the sake of the evaluation, let us take the dimensions of an image 
element to be around 0.1 cm. Then from (22.4b) the uncertainty 
in the transverse momentum component is $10^{-26}$ g·cm·s$^{-1}$. Since 
the electron velocity here is around $10^{10}$ cm·s$^{-1}$, its momentum is 
approximately $10^{-17}$ g·cm·s$^{-1}$. Hence, the uncertainty in the angle specifying 
the direction of the momentum is of the order of $10^{-9}$, and the inaccuracy 
with which the electron beam reaches the screen does not exceed 
$10^{-7}$ cm, yielding a “safety margin” of $10^{-1}/10^{-7}$, or a million-fold. 

This agrees with the conclusion derived for an atom, where no 
such “safety margin” exists. An atom is $10^{7}$ times smaller, so that 
$\Delta p \sim 10^{-19}$ g·cm·s$^{-1}$; the momentum of an electron in an atom, at a speed 
of $2 \times 10^{8}$ cm·s$^{-1}$, is also of the order of $10^{-19}$ g·cm·s$^{-1}$; that is, $\Delta p$ and $p$ are of 
the same order of magnitude, and there can be no electron path 
in an atom. 
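
The two estimates can be put side by side numerically. The sketch below is only an order-of-magnitude check under the same assumptions as the text (localization to 0.1 cm at $10^{10}$ cm·s$^{-1}$ for the kinescope, $10^{-8}$ cm at $2 \times 10^{8}$ cm·s$^{-1}$ for the atom); the constants are rounded and the function name is introduced here for illustration.

```python
H_ACTION = 1.0546e-27   # action quantum h as used in this book, erg*s
E_MASS = 9.109e-28      # electron mass, g

def momentum_uncertainty_ratio(delta_x, velocity, mass=E_MASS):
    """Ratio of the momentum uncertainty h / delta_x to the momentum m * v."""
    delta_p = H_ACTION / delta_x
    return delta_p / (mass * velocity)

# Image element of a kinescope: localization to about 0.1 cm
tv_ratio = momentum_uncertainty_ratio(delta_x=0.1, velocity=1e10)

# Electron in an atom: localization to about 1e-8 cm
atom_ratio = momentum_uncertainty_ratio(delta_x=1e-8, velocity=2e8)

if __name__ == "__main__":
    print("kinescope:", tv_ratio)
    print("atom:", atom_ratio)
```

For the kinescope the ratio is of the order of $10^{-9}$, so condition (22.5) holds with a wide margin; for the atom it is of the order of unity, so the condition fails.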

Thus, quantum mechanics does not abolish classical mechanics, 
but includes it as a limiting case, just as wave optics includes the 
limiting case of the geometrical optics of light rays. Moreover, 
quantum mechanics deals with the same quantities as classical 
mechanics: energy, momentum, coordinates, angular momentum; 
but the finite nature of the action quantum h imposes restrictions 
on the simultaneous applicability of any two classical concepts 
(for instance, coordinates and momentum) to one and the same state 
of motion. 

The momentum and coordinates of a particle cannot have precise 
values simultaneously because of the essentially wave nature of its 
motion. It is as meaningless to attempt to define their exact values 
as it is to seek an exact path of light beams in wave optics. Just as 
light rays cannot be made more precise by improving the optical 
instruments, no progress in measuring techniques will ever make 
it possible to determine an electron’s path to a greater accuracy 
than indicated by the relationships (22.4a) or (22.4b). As we have 
seen, the very concept of path or trajectory has the same approximate 
meaning as the concept of a light ray or beam. 

Sometimes erroneous attempts are made to interpret uncertainty 
relationships. For instance, it is assumed that a path cannot be 
determined because the accuracy of the initial conditions does not 
exceed $\Delta p_x$ and $\Delta x$, connected by relation (22.4a). This is to say 
that some actual path does exist somewhere, but it lies in a more 
or less narrow spatial domain and limited range of momenta. The 
“real” path is likened to the imaginary trajectory drawn from a gun 
to a target before firing. The path of the projectile is not precisely 
known beforehand, if only because strictly identical powder charges 
cannot be prepared. But this inaccuracy in the initial conditions 
for a bullet only leads to a smooth scattering curve for the hits on 
the target, while the distribution of electrons is subject to laws of 
wave diffraction: there exist minimum and maximum regions in 
no way associated with inaccurate knowledge of the initial conditions. 
Diffraction shows that no “real-though-unknown-to-us” trajectory 
exists. 

Uncertainty relations state not the error to which certain quantities 
can be measured simultaneously, but to what extent these 
quantities have precise meaning in the given motion. It is this that 
the uncertainty principle of quantum mechanics expresses. The 
term “uncertainty” emphasizes the fact that what we are concerned 
with is not accidental errors of measurement or the imperfection 
of physical apparatus, but the fact that the momentum and the coordinate 
of a particle have no precise meaning simultaneously in one and the 
same state of a microparticle. 


23 


THE WAVE EQUATION 

The Wave Equation. Diffraction occurs because of the superposition 
of wave amplitudes. When the phases coincide the intensity, which 
is proportional to the square of the resultant amplitude, is maximum; 
when the phases are opposite the intensity is minimum. In electron 
diffraction the quantity analogous to intensity is measured according 
to the blackening of a photographic plate, that is, the number of 
electrons impinging on unit area. The alternation of maxima and 
minima and their relative configurations in electron diffraction are 
subject to the same law as X-ray diffraction. In order to explain 
the diffraction of electrons it must be assumed that their motion, 
like the propagation of waves, can be associated with some wave 
function the behaviour of which determines the diffraction pattern. 

The blackening of the plate, that is, the number of electrons 
striking a unit area, is a physically observable quantity. Just as 
the number of hits in an artillery barrage is proportional to the hit 
probability in a given grid square, the density of electrons striking 
the photographic plate is in direct proportion to the probability of 
their occurring in the vicinity of a given point of the plate. Each 
electron impact should be regarded as an identical experiment 
the result of which is not predetermined and must be predicted on 
a probability basis, by stating the relationship between the probability 
density and the wave function of the electron’s motion. 

The analogy with the diffraction of an electromagnetic wave is 
extremely useful. The blackening of the plate caused by a wave is 
proportional to the square of the wave amplitude; it can therefore 
be expected that the probability of an electron occurring at a certain 
point is proportional to the square of the wave function. Since the 
observable quantity is the probability density, not the wave function, 
the latter must in the most general case be treated as a complex 
quantity the square of the modulus of which must be taken in order 
to pass to the probability density. Further on we shall see that 
by no means all states can be described by a real wave function. 

We shall assume the probability of an electron (or particle in 
general) occurring in a volume element dV to be related to the wave 
function by the following relationship: 

$$dw = |\psi(x, y, z, t)|^2\,dV \qquad (23.1)$$

where the wave function $\psi$ depends on position and time; the square 
of its modulus is the probability density. 

The Linearity of the Wave Equation. Just as in electrodynamics 
Maxwell’s equations are used to investigate the laws of propagation 
of a wave itself, and the intensity is found by squaring its amplitude, 
so in quantum mechanics the task is to find an equation governing 
the quantity $\psi$ rather than the probability density. For this equation 
to correspond to the similarity of the diffraction patterns observed 
for electrons and electromagnetic waves, it must, like Maxwell’s 
equations, be linear. 

Indeed, for the amplitudes to cancel out mutually they must be 
added up at every point in space. But for the sum of two solutions 
of the equation to satisfy it again, that is, also be its solution, the 
equation must necessarily be linear. Therefore, the equation governing 
the wave function is linear. From such an equation the wave phase can be 
determined, without which it is impossible to construct 
the diffraction pattern. 

We thus obtain one of the prime principles of quantum mechanics: 
the sum of two solutions of the wave equation is also its solution. 
This assertion is known as the superposition principle. Together 
with the uncertainty principle, which states the condition for the 
transformation of quantum equations into the classical equations 
of motion, it provides the approach to the fundamental equation of 
quantum mechanics. 

It is convenient to start by writing not the equation itself, but its 
solution, that is, the wave function, as applied to a free particle. 
Such a particle possesses an exactly defined and constant momentum 
$\mathbf{p}$. From the de Broglie relation (22.1), this momentum corresponds 
to the wave vector of a certain wave, $\mathbf{k} = \mathbf{p}/h$. Indeed, if electrons 
with momentum $\mathbf{p}$ are beamed on a crystal, they will present the 
same diffraction pattern as a wave with the wave vector $\mathbf{k} = \mathbf{p}/h$. 
It follows that the wave function of a free particle depends on the 
coordinates in the following way: 

$$\psi \propto e^{i\mathbf{k}\cdot\mathbf{r}} = e^{i\mathbf{p}\cdot\mathbf{r}/h}$$

Only the square of the modulus of the wave function can be measured, 
but not the function itself, which is why it is written in complex 
form. For physical considerations, the wave function cannot be 
required to be real. 

The dependence of the wave function on time is also easily determined 
if we recall that wave frequency corresponds to the energy 
of a particle in the same sense that the wave vector corresponds to 
the momentum. It was shown in Section 21 that the coefficient of 
proportionality between $\omega$ and $E$ should be the same as between $k$ 
and $p$. Therefore 

$$\omega = E/h \qquad (23.2)$$

From this we obtain the expression for the wave function of a free 
particle: 

$$\psi = e^{-i\omega t + i\mathbf{k}\cdot\mathbf{r}} = e^{-iEt/h + i\mathbf{p}\cdot\mathbf{r}/h} \qquad (23.3)$$

The group velocity of the waves is 

$$\mathbf{v} = \frac{\partial\omega}{\partial\mathbf{k}} = \frac{\partial E}{\partial\mathbf{p}} \qquad (23.4)$$

Thus, it coincides with the velocity of the particles, as it should 
in accordance with Section 21. 

But it is now apparent that the wave function (23.3) is simply 
related to the action of a free particle: 

$$\psi = e^{iS/h} \qquad (23.5)$$

with $S$ the action. Indeed, it is in this case equal to 

$$S = -Et + \mathbf{p}\cdot\mathbf{r} \qquad (23.6)$$

in agreement with the fact that 

$$\mathbf{p} = \frac{\partial S}{\partial\mathbf{r}} = \mathrm{grad}\,S, \qquad E = -\frac{\partial S}{\partial t}$$

as it should be according to the equations of Section 21, which 
establish the analogy between mechanics and optics. Equation (23.5) 
confirms the relationship between the phase of a wave and the action 
of a particle obtained in that section. 


The Wave Equation for a Free Particle. We shall now develop the 
wave equation for the just obtained function $\psi$. Naturally enough, 
the form of this equation is related to the fact that $E$ is expressed 
in terms of $p$, or what is the same thing, $\omega$ in terms of $k$. For example, 
in optics, where $\omega^2 = c^2 k^2$, the wave equation should be written as 

$$\nabla^2\varphi - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = 0$$

It is relativistically invariant. Our purpose is to first develop 
an equation of nonrelativistic mechanics, in which $E = p^2/(2m)$. 
As we know, the analogy between optics and mechanics in no way 
requires that the latter’s equations be written in a relativistically 
invariant form. The meaning of the analogy consists in the correspondence 
of the quantities. This is sufficient to obtain the equation 
for the wave function (23.3). We have 


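$$\frac{\partial\psi}{\partial t} = -\frac{iE}{h}\psi \qquad (23.7)$$

$$\frac{\partial\psi}{\partial x} = \frac{ip_x}{h}\psi, \qquad \frac{\partial^2\psi}{\partial x^2} = -\frac{p_x^2}{h^2}\psi \qquad (23.8)$$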

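From (23.7) and (23.8) we obtain 

$$-\frac{h}{i}\frac{\partial\psi}{\partial t} = -\frac{h^2}{2m}\left(\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}\right) \qquad (23.9)$$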


that is, a nonrelativistic equation for the wave function $\psi$. In 
terms of the Laplace operator it is expressed as follows: 

$$-\frac{h}{i}\frac{\partial\psi}{\partial t} = -\frac{h^2}{2m}\nabla^2\psi \qquad (23.10)$$

Equation (23.9) is valid because $E = p^2/(2m)$. 
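
That the plane wave satisfies the free-particle equation only when $E = p^2/(2m)$ can be checked by finite differences. The following is a minimal one-dimensional sketch in units where $h = m = 1$; the units, the sample point and the step size are assumptions chosen for the example.

```python
import cmath

HBAR = 1.0   # action quantum h of the text, set to 1 for illustration
MASS = 1.0

def plane_wave(x, t, p, energy):
    """One-dimensional free-particle wave function exp(-iEt/h + ipx/h)."""
    return cmath.exp(1j * (p * x - energy * t) / HBAR)

def schrodinger_residual(p, energy, x=0.3, t=0.7, step=1e-4):
    """|left side - right side| of the free equation, by central differences."""
    def psi(xx, tt):
        return plane_wave(xx, tt, p, energy)

    dpsi_dt = (psi(x, t + step) - psi(x, t - step)) / (2 * step)
    d2psi_dx2 = (psi(x + step, t) - 2 * psi(x, t) + psi(x - step, t)) / step ** 2
    lhs = -(HBAR / 1j) * dpsi_dt
    rhs = -(HBAR ** 2 / (2 * MASS)) * d2psi_dx2
    return abs(lhs - rhs)

if __name__ == "__main__":
    p = 2.0
    print("E = p^2/(2m):", schrodinger_residual(p, p ** 2 / (2 * MASS)))
    print("E = 3 (wrong):", schrodinger_residual(p, 3.0))
```

With the correct relation between energy and momentum the residual is at the level of the discretization error; with any other energy it is of order unity.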


The Schrödinger Equation. Let us generalize Eq. (23.10) for the 
case of a particle moving in an external potential field $U(\mathbf{r})$. To 
obtain a relationship $E = p^2/(2m) + U$ analogous to $E = p^2/(2m)$ 
for a free particle, we must put 

$$-\frac{h}{i}\frac{\partial\psi}{\partial t} = -\frac{h^2}{2m}\nabla^2\psi + U\psi \qquad (23.11)$$

This equation was developed in 1926 by Erwin Schrödinger, who 
generalized the de Broglie relations for the case of bound electrons. 

Equation (23.11) follows directly from (23.10) for the simplest 
case of $U$ = constant, because then it is satisfied by the same substitution 
of (23.3) but for the value of momentum $p = [2m(E - U)]^{1/2}$. 
From here it is but one step to the generalization for the case of 
variable potential energy. 

This generalization, however, can in no circumstances be regarded 
as the “derivation” of a quantum mechanical equation from principles 
or equations of pre-quantum, classical physics. The Schrödinger 
equation expresses a new physical law. 

The validity of Eq. (23.11) is seen in the limiting transition 
to classical physics, similar to the transition from wave to geometrical 
optics. 

Let us assume that Eq. (23.5) involves the action not of a free 
particle but of one moving in a field $U(\mathbf{r})$, and find the approximation 
that satisfies the Schrödinger equation (23.11), if the wave 
function in it is expressed in terms of the action. 

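We determine the values of the derivatives of the wave function: 

$$\frac{\partial\psi}{\partial t} = \frac{\partial}{\partial t}e^{iS/h} = \frac{i}{h}\frac{\partial S}{\partial t}\psi$$

$$\frac{\partial\psi}{\partial x} = \frac{\partial}{\partial x}e^{iS/h} = \frac{i}{h}\frac{\partial S}{\partial x}\psi$$

$$\frac{\partial^2\psi}{\partial x^2} = \left[\frac{i}{h}\frac{\partial^2 S}{\partial x^2} - \frac{1}{h^2}\left(\frac{\partial S}{\partial x}\right)^2\right]\psi$$

substitute them into Eq. (23.11) and cancel out $\psi$. This leaves the 
following equation for $S$: 

$$-\frac{\partial S}{\partial t} = \frac{(\mathrm{grad}\,S)^2}{2m} - \frac{ih}{2m}\nabla^2 S + U \qquad (23.12)$$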


It is best to pass to the limit of classical mechanics by assuming 
that the action quantum h tends to zero. This is similar to the 
assertion that all classical quantities characterizing motion are 
large in comparison with h. Then from (23.12) we obtain 


$$-\frac{\partial S}{\partial t} = \frac{(\mathrm{grad}\,S)^2}{2m} + U \qquad (23.13)$$


which is the Hamilton-Jacobi equation. 

The limiting process performed here almost exactly repeats the 
transition from wave to geometrical optics carried out in Section 21. 
Indeed, if we put h = 0, this is equivalent to the vanishing of the 
de Broglie wavelength, which corresponds to a transition from waves 
to paths. 

We thus see that the Schrödinger wave equation does in fact yield 
a correct limiting process. It is as it were the fourth term in the 
relationship 


geometrical optics  →  classical mechanics 
        ↓                       ↓ 
   wave optics      →  quantum mechanics 


The vertical arrows denote the passage from rays or paths to the 
wave picture; the horizontal arrows denote the passage from waves 
to particles. The latter refers only to nonquantized electromagnetic 
field equations, since quantization requires corpuscular representation 
of light quanta. The analogy here is between quantum mechanics 
and classical wave optics. 

The Range of Applicability of Various Theories. The regions in 
which quantum mechanics and wave optics can be applied do not, 
strictly speaking, overlap anywhere; in wave optics or, what is just 
the same, electrodynamics, the velocity of light c is considered 
finite, and the quantum of action h is considered arbitrarily small. 
In nonrelativistic quantum mechanics c is considered arbitrarily 
large, while $h$ has a finite value. A quantum theory of the electromagnetic 
field, in which both $h$ and $c$ have finite values (that is, the 
velocities are comparable with $c$, and quantities with the 
dimensions of action are comparable with $h$), has, in essentials, also 
been completed. At any rate, any concrete problem requiring the 
application of quantum electrodynamics may be uniquely solved 
to any required degree of precision, and the results agree with experiment. 

Nonrelativistic quantum mechanics based on the relationship 
E = p 2 /(2m) + U is, within its sphere of application, as complete 
a theory as Newtonian mechanics. Like the equations of Newtonian 
mechanics, the wave equation (23.11) is valid only for particle 
velocities that are sufficiently small in comparison with the speed 
of light, but in the sphere of its application it is as firmly established 
as Newton’s laws of motion are for macroscopic bodies. 

Of course, quantum mechanics continues to perfect its methods 
of approach to various specific problems. The basis for this is the 
correctness of its general propositions. In this sense, Newtonian 
mechanics, too, is continuing to advance to this day. 

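The Normalization Condition for a Wave Function. Let us return 
to the wave equation (23.11). We write it for a wave function $\psi$ and 
for the complex conjugate $\psi^*$, in the equation for which we must replace 
$i$ by $-i$: 

$$-\frac{h}{i}\frac{\partial\psi}{\partial t} = -\frac{h^2}{2m}\nabla^2\psi + U\psi$$

$$\frac{h}{i}\frac{\partial\psi^*}{\partial t} = -\frac{h^2}{2m}\nabla^2\psi^* + U\psi^*$$

We multiply the first equation by $\psi^*$, and the second by $\psi$, and 
subtract the second from the first. The term $U\psi\psi^*$ is eliminated 
and the remaining terms give 

$$-\frac{h}{i}\psi^*\frac{\partial\psi}{\partial t} - \frac{h}{i}\psi\frac{\partial\psi^*}{\partial t} = -\frac{h^2}{2m}\left(\psi^*\nabla^2\psi - \psi\nabla^2\psi^*\right) \qquad (23.14)$$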


2m 


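The left-hand side of the last equality is transformed to the form 

$$-\frac{h}{i}\frac{\partial}{\partial t}|\psi|^2$$

We can write the right-hand side explicitly thus: 

$$-\frac{h^2}{2m}\left(\psi^*\,\mathrm{div}\,\mathrm{grad}\,\psi - \psi\,\mathrm{div}\,\mathrm{grad}\,\psi^*\right) = -\frac{h^2}{2m}\,\mathrm{div}\left(\psi^*\,\mathrm{grad}\,\psi - \psi\,\mathrm{grad}\,\psi^*\right)$$

(see (11.27)). Finally, we represent the equality in the following 
form: 

$$\frac{\partial}{\partial t}|\psi|^2 = -\mathrm{div}\left[\frac{h}{2mi}\left(\psi^*\,\mathrm{grad}\,\psi - \psi\,\mathrm{grad}\,\psi^*\right)\right] \qquad (23.15)$$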

The left-hand side of this equality is the time derivative of the 
probability density of finding a particle close to some point of space. 
Let us integrate (23.15) over the whole volume in which the particle 
might be situated. If this volume is finite, then beyond its boundaries 
$\psi$ and $\psi^*$ must be equal to zero. But then, from the Gauss theorem, 
the right-hand side of (23.15), transformed into a surface integral 
(the surface being outside the volume), vanishes: 

$$\frac{d}{dt}\int|\psi|^2\,dV = 0 \qquad (23.16)$$

It follows that the integral itself does not depend upon time. 
If it has a definite value, as may be expected in integration over 
a finite volume, it can be given a simple physical meaning, namely 
that it must be equal to the probability of an electron, or any particle 
in general, occurring somewhere within a volume in which it is 
bound to be. Such a probability is equal to certainty, or unity. 
Thus 

$$\int|\psi|^2\,dV = 1 \qquad (23.17)$$

This equation is called the normalization condition for a wave 
function. 

If the motion is infinite, the integral involved in (23.17) is not 
finite. In that case we can either vary the normalization condition 
(see Sec. 25) or normalize the probability to a large, but finite, 
volume and then let that volume tend to infinity. The physical results are, of 
course, not affected by this. 

If we integrate (23.15) over an arbitrary volume, we obtain 

4r j M>NF= — j div [ 2 ^ 7 H 3 *gradi|)—grado|>*)]dF 

= — j ~2 m ~i £ ra( * ^ ^ ^ raC * (23.18) 

On the left we have the rate of change of the probability of an
electron occurring within the given volume; on the right, with a
minus sign, is the probability flux out across the boundary surface
of the volume. It is apparent from (23.18) that the density of the
probability flux is

$$\mathbf{j} = \frac{h}{2mi}\left(\psi^*\,\mathrm{grad}\,\psi - \psi\,\mathrm{grad}\,\psi^*\right) \tag{23.19}$$

It follows that a real function $\psi$ yields $\mathbf{j} = 0$ and does not describe
an electron current. Therefore the general definition of a wave function
must of necessity involve complex quantities.
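The statement about real functions can be checked numerically. The following is a minimal sketch, assuming one spatial dimension, units in which h = m = 1, and two hypothetical test functions (a travelling wave and a standing wave) that are not taken from the text; the derivative is approximated by a central difference.

```python
import cmath
import math

H = 1.0   # quantum of action (the text's h); the unit choice is an assumption
M = 1.0   # particle mass; the unit choice is an assumption

def flux(psi, x, step=1e-5):
    """One-dimensional analogue of (23.19): j = (h / 2mi) (psi* psi' - psi psi*')."""
    dpsi = (psi(x + step) - psi(x - step)) / (2 * step)   # central difference
    value = psi(x)
    j = (H / (2j * M)) * (value.conjugate() * dpsi - value * dpsi.conjugate())
    return j.real   # the combination in brackets is purely imaginary, so j is real

k = 2.0
plane_wave = lambda x: cmath.exp(1j * k * x)        # complex travelling wave
standing_wave = lambda x: complex(math.cos(k * x))  # purely real function

print(flux(plane_wave, 0.3))      # close to h k / m
print(flux(standing_wave, 0.3))   # zero: a real function carries no flux
```

The travelling wave gives a flux equal to hk/m times its unit probability density, while the real function gives none, in line with the remark above.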

The Equation for Stationary States. Suppose that the potential
energy does not depend on time explicitly. Then in classical mechanics,
as we know, the law of conservation of energy of the system
holds. The action of such a system involves the term $-Et$. But since
in the classical limit $\psi = e^{iS/h}$, we shall seek $\psi$ proportional to $e^{-iEt/h}$
in the most general case as well:

$$\psi = e^{-iEt/h}\,\psi_0(x, y, z) \tag{23.20}$$

Substituting this expression into (23.11) and abandoning the zero 
subscript, we obtain the equation 

$$-\frac{h^2}{2m}\nabla^2\psi + U\psi = E\psi \tag{23.21}$$

In this equation the energy E should be treated as a certain constant
quantity. If we are concerned with the states corresponding
to finite motions, the probability of a particle occurring at an infinite
distance from the domain in which $\psi$ is other than zero must be
infinitesimal. Analyses of specific examples reveal that this
holds not for all values of the energy E, but only for certain values
belonging to a discrete set.

If, for example, $U(\infty) = 0$, then for negative values of E we have
two asymptotic forms of the function, $\psi \sim \exp[\pm(2m|E|)^{1/2}r/h]$. One of
them increases exponentially as $r \to \infty$, and there is no way of
normalizing it, because at infinite distance from the origin of the
coordinate system $\exp[(2m|E|)^{1/2}r/h]$ is infinite. There are,
however, select values of E for which the coefficient of the exponentially
increasing solution vanishes. It follows that in finite motion
only certain definite values of E are possible. Their totality is called
the energy spectrum of the system.
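How the growing exponential selects discrete energies can be seen in a simple example. The sketch below assumes a one-dimensional rectangular well, U = -U0 for |x| < A and U = 0 outside (a hypothetical potential chosen for illustration, with h = m = 1), and considers even states only. Inside the well $\psi = \cos kx$; outside, $\psi = a\,e^{-\kappa x} + b\,e^{\kappa x}$. Matching $\psi$ and its derivative at x = A gives the coefficient b of the growing solution, and the permitted energies are the roots of b(E) = 0.

```python
import math

H = 1.0     # quantum of action (the text's h); the unit choice is an assumption
M = 1.0     # particle mass
A = 1.0     # half-width of the hypothetical well
U0 = 10.0   # depth: U = -U0 for |x| < A and U = 0 outside, so U(infinity) = 0

def growing_coefficient(E):
    """b * exp(kappa * A): weight of the growing exponential for an even state."""
    k = math.sqrt(2 * M * (E + U0)) / H      # wave number inside the well
    kappa = math.sqrt(2 * M * abs(E)) / H    # decay constant outside (E < 0)
    # From cos(kA) = a e^{-kappa A} + b e^{kappa A} and the matching of derivatives.
    return 0.5 * (math.cos(k * A) - (k / kappa) * math.sin(k * A))

def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def even_levels(samples=2000):
    """Energies in (-U0, 0) at which the growing solution is absent."""
    grid = [-U0 + (i + 0.5) * U0 / samples for i in range(samples)]
    levels = []
    for e1, e2 in zip(grid, grid[1:]):
        if growing_coefficient(e1) * growing_coefficient(e2) < 0:
            levels.append(bisect(growing_coefficient, e1, e2))
    return levels

print(even_levels())
```

For every other energy in the interval the coefficient b is nonzero and the function cannot be normalized, so only the listed values belong to the spectrum of even states of this well.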

The state of a system corresponding to such an energy value is
called stationary, and the value itself is known as an energy eigenvalue.
In general, any value of the energy for which Eq. (23.21) has a
solution satisfying the appropriate boundary conditions is called
an energy eigenvalue. The choice of boundary conditions depends on
the type of motion being investigated.

Thus, we find that, unlike energy in classical mechanics, in quantum
mechanics the energy of a finite motion cannot be specified arbitrarily.


24 


OPERATORS IN QUANTUM MECHANICS 


Momentum Eigenvalues. At the end of the preceding section we 
defined energy eigenvalues on the basis of the wave equation (23.21). 
However, it is also very important to find the eigenvalues of other 
quantities: linear momentum, angular momentum, etc. In order 
to obtain the respective equations it is convenient to proceed from 
the form of $\psi$ in passing to the limit of classical mechanics:

$$\psi = e^{iS/h} \tag{24.1}$$

as was done in deriving Eq. (23.21).

Let us apply the operation $(h/i)(\partial/\partial x)$ to both sides of Eq. (24.1),
that is, we take the partial derivative with respect to x and multiply
by $h/i$ to get

$$\frac{h}{i}\frac{\partial\psi}{\partial x} = \frac{\partial S}{\partial x}\,e^{iS/h} = \frac{\partial S}{\partial x}\,\psi \tag{24.2}$$


But in the classical limit S becomes the action of the particle,
while $\partial S/\partial x$ becomes the component of momentum $p_x$ (see (10.24)).
Therefore, the eigenvalue equation for momentum that yields the
correct transition to classical mechanics is of the form

$$\frac{h}{i}\frac{\partial\psi}{\partial x} = p_x\psi \tag{24.3}$$

where $p_x$ is the eigenvalue of the momentum projection on the x axis.

Momentum and Energy Operators. Let us compare Eq. (24.3) with
the wave equation (23.21):

$$\frac{1}{2m}\left[\left(\frac{h}{i}\frac{\partial}{\partial x}\right)^2 + \left(\frac{h}{i}\frac{\partial}{\partial y}\right)^2 + \left(\frac{h}{i}\frac{\partial}{\partial z}\right)^2\right]\psi + U\psi = E\psi \tag{24.4}$$

Here, the symbol $(\partial/\partial x)^2$ denotes the second derivative, which
must then be applied to $\psi$.

In order to find the energy and momentum eigenvalues we must
perform a definite set of differential operations and multiplications
by functions of the coordinates on the left-hand sides of Eqs. (23.21) and
(24.3). But these sets are connected in a very curious manner, as
will now be shown. We shall call the symbol $\partial/\partial x$ multiplied by $h/i$
the momentum operator applied to a wave function. Instead of
$(h/i)(\partial/\partial x)$ we symbolically write $\hat{p}_x$. Then Eq. (24.3) is rewritten as

$$\hat{p}_x\psi = p_x\psi \tag{24.5}$$

This equation denotes exactly the same as (24.3), but the symbolic
notation $\hat{p}_x$ emphasizes that the corresponding operation is
applied in order to find the momentum eigenvalues.

The operation on the left-hand side of (24.4) we shall also symbolically
denote by $\hat{\mathcal{H}}$. We write $\mathcal{H}$ and not E because the energy is
assumed to be expressed in terms of momentum, similar to the Hamiltonian
$\mathcal{H}$. Then, in shorter notation, (24.4) appears as

$$\hat{\mathcal{H}}\psi = E\psi \tag{24.6}$$

Comparing (24.4) and (24.3), we see that the momentum and energy
operators are related by the same equations as the corresponding
quantities:

$$\hat{\mathcal{H}} = \frac{1}{2m}\left(\hat{p}_x^2 + \hat{p}_y^2 + \hat{p}_z^2\right) + \hat{U} \tag{24.7}$$

We have written $\hat{U}$ instead of simply U to emphasize that in this
equation U is regarded not as an independent quantity but as an
operator acting on $\psi$, namely the multiplication of $\psi$ by $U(\mathbf{r})$.
Equation (24.7) is symbolic: it is understood that both sides
are applied to $\psi$.

The usefulness of abbreviated operator notation in quantum 
mechanics is that the equations thus become more expressive. The 
relation between quantum laws of motion and classical laws, which 
are limiting cases with respect to the quantum ones, can be best 
of all seen in operator notation. 

If in classical equations relating mechanical quantities we replace
the momenta by their operators, we then obtain correct operator
relationships of quantum mechanics. The limiting transition to
classical mechanics restores the usual relationships between quantities.
Indeed, in the limiting transition (24.1), the operator
$\hat{\mathbf{p}} = (h/i)\nabla$ applied to $\psi$ gives $\mathbf{p}\psi$. If we must perform the limiting transition
for $\hat{\mathbf{p}}^2$, then we need to differentiate only the exponential each time,
because this yields the quantum of action in the denominator. For
$h \to 0$, only terms with the highest degree of h in the denominator
remain, and it is these very terms that are obtained in replacing
the operator $\hat{\mathbf{p}}$ by the quantity grad S (that is, by the classical
momentum vector).
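As a worked illustration of this limiting transition (a sketch in one dimension, using nothing beyond $\psi = e^{iS/h}$ from (24.1)), we apply $\hat{p}_x$ twice:

```latex
\hat{p}_x^2\, e^{iS/h}
  = \frac{h}{i}\frac{\partial}{\partial x}\left(\frac{\partial S}{\partial x}\, e^{iS/h}\right)
  = \left[\left(\frac{\partial S}{\partial x}\right)^2
        + \frac{h}{i}\,\frac{\partial^2 S}{\partial x^2}\right] e^{iS/h}
```

The first term arises from differentiating the exponential both times and reproduces the classical $p_x^2$. The second arises from differentiating the factor $\partial S/\partial x$; it carries an extra power of h and therefore drops out as $h \to 0$.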

But the advantages of operator symbolism in quantum mechanics 
are not restricted to limiting processes. They will be made apparent 
in the following discourse. 


The Operator of an Angular Momentum Component. It is now easy
to determine the operator of the angular momentum projection $M_z$.
We know from Section 5 that the angular momentum $M_z$ is at the
same time the generalized momentum corresponding to the angle of
rotation $\varphi$ about the z axis, taken as a generalized coordinate:
$M_z = p_\varphi$. Then, from (10.24)

$$p_\varphi = \frac{\partial S}{\partial\varphi} \tag{24.8}$$

It is therefore clear why in quantum mechanics the operator
should have the form

$$\hat{p}_\varphi = \frac{h}{i}\frac{\partial}{\partial\varphi} \tag{24.9}$$

But according to classical mechanics the projection $M_z$ is related
to the Cartesian projections of linear momentum as follows:

$$M_z = xp_y - yp_x \tag{24.10}$$

Hence, there should exist the operator relationship

$$\hat{p}_\varphi = \hat{M}_z = x\hat{p}_y - y\hat{p}_x = \frac{h}{i}\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right) \tag{24.11}$$

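Let us verify that the definitions (24.9) and (24.11) do indeed
coincide. We pass to cylindrical coordinates

$$x = r\cos\varphi, \qquad y = r\sin\varphi$$

whence we have

$$\frac{\partial\psi}{\partial x} = \frac{\partial\psi}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial\psi}{\partial\varphi}\frac{\partial\varphi}{\partial x}, \qquad \frac{\partial\psi}{\partial y} = \frac{\partial\psi}{\partial r}\frac{\partial r}{\partial y} + \frac{\partial\psi}{\partial\varphi}\frac{\partial\varphi}{\partial y}$$

Expressing cylindrical coordinates in terms of Cartesian, we write

$$r = (x^2 + y^2)^{1/2}, \qquad \frac{\partial r}{\partial x} = \frac{x}{r} = \cos\varphi, \qquad \frac{\partial r}{\partial y} = \frac{y}{r} = \sin\varphi$$

$$\varphi = \operatorname{arccot}\frac{x}{y}, \qquad \frac{\partial\varphi}{\partial x} = -\frac{y}{r^2} = -\frac{\sin\varphi}{r}, \qquad \frac{\partial\varphi}{\partial y} = \frac{x}{r^2} = \frac{\cos\varphi}{r}$$

Substituting all these expressions into (24.11), we find that it is
indeed identical in meaning to (24.9):

$$\frac{h}{i}\left(x\frac{\partial\psi}{\partial y} - y\frac{\partial\psi}{\partial x}\right) = \frac{h}{i}\left[r\cos\varphi\left(\sin\varphi\,\frac{\partial\psi}{\partial r} + \frac{\cos\varphi}{r}\frac{\partial\psi}{\partial\varphi}\right) - r\sin\varphi\left(\cos\varphi\,\frac{\partial\psi}{\partial r} - \frac{\sin\varphi}{r}\frac{\partial\psi}{\partial\varphi}\right)\right]$$

$$= \frac{h}{i}\left(\cos^2\varphi + \sin^2\varphi\right)\frac{\partial\psi}{\partial\varphi} = \frac{h}{i}\frac{\partial\psi}{\partial\varphi}$$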

The other two angular momentum projections can be determined
similarly to (24.10), but first it is necessary to establish which quantities
can exist simultaneously in the same physical state of the
system. We saw in Section 22, for example, that a coordinate and
the corresponding linear momentum projection cannot exist simultaneously.
Let us now develop a general criterion in terms of the
operators of the corresponding quantities.


The Simultaneous Existence of Two Physical Quantities. Suppose
that in a certain state described by the wave function $\psi$ there simultaneously
exist two physical quantities, $\lambda$ and $\nu$. Let us find the
necessary condition for this.

If in a certain state it is possible to define a physical quantity $\lambda$,
the wave function of that state must be an eigenfunction of the
operator $\hat{\lambda}$. If a quantity $\nu$ is also defined in that state, then the
same function satisfies two equations

$$\hat{\lambda}\psi = \lambda\psi \tag{24.12}$$

$$\hat{\nu}\psi = \nu\psi \tag{24.13}$$

In other words, the function $\psi$ is an eigenfunction of both the
operator $\hat{\lambda}$ and the operator $\hat{\nu}$. Let us now operate with $\hat{\nu}$ on (24.12),
remembering that on the right-hand side of this equation we have
simply the quantity $\lambda$, not an operator; we similarly operate with $\hat{\lambda}$
on (24.13). We then obtain

$$\hat{\nu}\hat{\lambda}\psi = \hat{\nu}\lambda\psi = \lambda\hat{\nu}\psi = \lambda\nu\psi \tag{24.14a}$$

$$\hat{\lambda}\hat{\nu}\psi = \hat{\lambda}\nu\psi = \nu\hat{\lambda}\psi = \nu\lambda\psi \tag{24.14b}$$

Now subtract (24.14b) from (24.14a) to get

$$\hat{\nu}\hat{\lambda}\psi - \hat{\lambda}\hat{\nu}\psi = \lambda\nu\psi - \nu\lambda\psi = 0$$

Since only products of numbers appear on the right-hand side,
they cancel out, and we arrive at the required condition:

$$\hat{\nu}\hat{\lambda}\psi = \hat{\lambda}\hat{\nu}\psi \tag{24.15}$$

Equation (24.15) can be written symbolically as an equality
between operators:

$$\hat{\nu}\hat{\lambda} = \hat{\lambda}\hat{\nu}, \qquad \hat{\nu}\hat{\lambda} - \hat{\lambda}\hat{\nu} = 0 \tag{24.16}$$

This symbolic equality means that the result of a successive
operation with $\hat{\lambda}$ and $\hat{\nu}$ should not depend on the order of operation,
otherwise Eqs. (24.12) and (24.13) cannot have the common eigenfunction
$\psi$, that is, a common solution.

We have proved here the necessity of condition (24.16) for the
quantities $\lambda$ and $\nu$ to exist simultaneously in the same state of a
system to which the wave function $\psi$ corresponds. We could also
show that this condition is sufficient, but we shall not go into the
proof here.


Commutation Relations for Certain Operators. Let us now apply
the obtained result to two quantities which definitely do not exist
in the same state, the coordinate x and momentum $p_x$. We must
calculate the commutator $\hat{p}_x x - x\hat{p}_x$.

Changing from symbolic notation to the usual one, we obtain

$$\frac{h}{i}\frac{\partial}{\partial x}(x\psi) - x\,\frac{h}{i}\frac{\partial\psi}{\partial x} = \frac{h}{i}\psi + x\,\frac{h}{i}\frac{\partial\psi}{\partial x} - x\,\frac{h}{i}\frac{\partial\psi}{\partial x} = \frac{h}{i}\psi$$

Reverting to symbolic notation, we represent the obtained result
in the following form:

$$\hat{p}_x x - x\hat{p}_x = \frac{h}{i} \tag{24.17}$$

Thus, the result of operating with $\hat{p}_x$ and x depends upon the
order of their action; $\hat{p}_x$ and x do not commute. And this was to be
expected because the quantities $p_x$ and x do not exist simultaneously.
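Relation (24.17) lends itself to a direct numerical check. The following sketch assumes h = 1 and a hypothetical test function; the operators are represented as functions that take a wave function and return a new one, with the derivative replaced by a central difference.

```python
import cmath

H = 1.0  # quantum of action (the text's h), set to 1 as an assumption

def p_op(f, d=1e-5):
    """Return the function (h/i) df/dx, approximated by a central difference."""
    return lambda x: (H / 1j) * (f(x + d) - f(x - d)) / (2 * d)

def x_op(f):
    """Return the function x * f(x)."""
    return lambda x: x * f(x)

psi = lambda x: cmath.exp(-x * x + 0.7j * x)   # hypothetical test function

x0 = 0.4
commutator = p_op(x_op(psi))(x0) - x_op(p_op(psi))(x0)   # (p x - x p) psi at x0
print(commutator, (H / 1j) * psi(x0))
```

The two printed numbers coincide up to the discretization error: the order in which the two operators act matters, and the difference is $(h/i)\psi$.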

The eigenfunction of the operator x satisfies the equation
$(x - x')\psi = 0$. Consequently, it is equal to zero over the whole
region where the coordinate x is not equal to the chosen eigenvalue $x'$.
This function differs from zero only at one point $x = x'$. The eigenfunction
of the momentum operator, which satisfies Eq. (24.3), is
$e^{ip_x x/h}$; it differs from zero over all the space. This example shows
how great the difference is between the eigenfunctions of operators
that do not commute. The abbreviated notation (24.17) is a convenient
representation of the preceding equation in which the momentum
operator is expressed explicitly in terms of the derivative.
In future we shall write commutation relations exclusively in operator
form, because the symbolic notation is more concise and vivid:
one immediately sees the quantities represented by the operators.

The various momentum components commute, since

$$\hat{p}_x\hat{p}_y - \hat{p}_y\hat{p}_x = 0 \tag{24.18}$$

(the word “operator” will be frequently omitted in future as being
self-evident). The commutation relation (24.18) is obtained simply
from the fact that the result of applying two partial derivatives
does not depend upon the order of differentiation. It is also obvious
that

$$\hat{p}_y x - x\hat{p}_y = 0 \tag{24.19}$$

In tensor notation these commutation relations take the form

$$\hat{p}_\alpha x_\beta - x_\beta\hat{p}_\alpha = \frac{h}{i}\,\delta_{\alpha\beta} \tag{24.20}$$

$$\hat{p}_\alpha\hat{p}_\beta - \hat{p}_\beta\hat{p}_\alpha = 0 \tag{24.21}$$

$$x_\alpha x_\beta - x_\beta x_\alpha = 0 \tag{24.22}$$

We now calculate the commutator of two angular momentum components,
$\hat{M}_x$ and $\hat{M}_y$, where

$$\hat{M}_x = y\hat{p}_z - z\hat{p}_y, \qquad \hat{M}_y = z\hat{p}_x - x\hat{p}_z$$

We group the terms without violating the order of the coordinates
and corresponding linear momenta to get

$$\hat{M}_x\hat{M}_y - \hat{M}_y\hat{M}_x = (y\hat{p}_z - z\hat{p}_y)(z\hat{p}_x - x\hat{p}_z) - (z\hat{p}_x - x\hat{p}_z)(y\hat{p}_z - z\hat{p}_y)$$

Making use of the commutation relation $\hat{p}_z z - z\hat{p}_z = h/i$, we
obtain the required result:

$$\hat{M}_x\hat{M}_y - \hat{M}_y\hat{M}_x = y\hat{p}_x(\hat{p}_z z - z\hat{p}_z) - x\hat{p}_y(\hat{p}_z z - z\hat{p}_z) = \frac{h}{i}(y\hat{p}_x - x\hat{p}_y) = ih\hat{M}_z \tag{24.23a}$$

Changing the indices x, y, z cyclically, we obtain the remaining
commutation relations:

$$\hat{M}_y\hat{M}_z - \hat{M}_z\hat{M}_y = ih\hat{M}_x \tag{24.23b}$$

$$\hat{M}_z\hat{M}_x - \hat{M}_x\hat{M}_z = ih\hat{M}_y \tag{24.23c}$$

All three commutation relations can be easily remembered if we
write them in contracted vector form as

$$\hat{\mathbf{M}} \times \hat{\mathbf{M}} = ih\hat{\mathbf{M}} \tag{24.24}$$

Expanding this equality in components, we once again arrive at
(24.23a)-(24.23c).
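Relation (24.23a) can be spot-checked by letting the differential operators act on a test function. The sketch below assumes h = 1 and a hypothetical polynomial test function; derivatives are replaced by central differences, so the agreement holds up to an error of the order of the square of the step d.

```python
H = 1.0  # quantum of action (the text's h), set to 1 as an assumption

def partial(f, axis, d=1e-3):
    """Central-difference partial derivative of f(x, y, z) along the given axis."""
    def df(x, y, z):
        plus, minus = [x, y, z], [x, y, z]
        plus[axis] += d
        minus[axis] -= d
        return (f(*plus) - f(*minus)) / (2 * d)
    return df

def m_x(f):  # M_x = y p_z - z p_y with p = (h/i) grad
    return lambda x, y, z: (H / 1j) * (y * partial(f, 2)(x, y, z) - z * partial(f, 1)(x, y, z))

def m_y(f):  # M_y = z p_x - x p_z
    return lambda x, y, z: (H / 1j) * (z * partial(f, 0)(x, y, z) - x * partial(f, 2)(x, y, z))

def m_z(f):  # M_z = x p_y - y p_x
    return lambda x, y, z: (H / 1j) * (x * partial(f, 1)(x, y, z) - y * partial(f, 0)(x, y, z))

# Hypothetical polynomial test function, not taken from the text
f = lambda x, y, z: x * x * y + 2 * y * z * z - 3 * x * z + y * y

point = (0.3, -0.7, 1.2)
lhs = m_x(m_y(f))(*point) - m_y(m_x(f))(*point)   # (M_x M_y - M_y M_x) f
rhs = 1j * H * m_z(f)(*point)                     # i h M_z f
print(lhs, rhs)
```

The same three helper functions, permuted cyclically, can be used to check (24.23b) and (24.23c).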

It will be noted that the vector product of an operator by itself
may not equal zero if the operator components of the vector do not
commute (but, for example, $\hat{\mathbf{p}} \times \hat{\mathbf{p}} = 0$).

The Square of the Angular Momentum. Let us now examine further
properties of angular momentum. We shall show that even though
two angular momentum projections do not exist simultaneously, a single angular
momentum projection exists together with the square of the angular momentum

$$\hat{M}^2 = \hat{M}_x^2 + \hat{M}_y^2 + \hat{M}_z^2 \tag{24.25}$$

Let us verify this. For this we find the commutator

$$\hat{M}^2\hat{M}_z - \hat{M}_z\hat{M}^2 = \hat{M}_x^2\hat{M}_z - \hat{M}_z\hat{M}_x^2 + \hat{M}_y^2\hat{M}_z - \hat{M}_z\hat{M}_y^2$$

where the equality holds because $\hat{M}_z$ and $\hat{M}_z^2$, of course, commute.
Let us add to the right-hand side of the last equality and subtract
from it the combinations $\hat{M}_x\hat{M}_z\hat{M}_x$ and $\hat{M}_y\hat{M}_z\hat{M}_y$; we take $\hat{M}_x$
and $\hat{M}_y$ outside the brackets, once on the right and once on the
left. Then we obtain

$$\begin{aligned}
\hat{M}^2\hat{M}_z - \hat{M}_z\hat{M}^2 &= \hat{M}_x(\hat{M}_x\hat{M}_z - \hat{M}_z\hat{M}_x) + (\hat{M}_x\hat{M}_z - \hat{M}_z\hat{M}_x)\hat{M}_x \\
&\quad + \hat{M}_y(\hat{M}_y\hat{M}_z - \hat{M}_z\hat{M}_y) + (\hat{M}_y\hat{M}_z - \hat{M}_z\hat{M}_y)\hat{M}_y \\
&= -ih\hat{M}_x\hat{M}_y - ih\hat{M}_y\hat{M}_x + ih\hat{M}_y\hat{M}_x + ih\hat{M}_x\hat{M}_y = 0
\end{aligned} \tag{24.26}$$

Here we have made use of the commutation relations for separate
angular momentum components.

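For subsequent applications the operator of the angular momentum
square must be transformed to spherical coordinates. For that it is
useful first to go over to tensor notation. For a separate component
we have

$$\hat{M}_\alpha = \varepsilon_{\alpha\beta\gamma}\,x_\beta\hat{p}_\gamma \tag{24.27}$$

whence the operator of the square of the angular momentum is

$$\hat{M}^2 = \hat{M}_\alpha\hat{M}_\alpha = \varepsilon_{\alpha\beta\gamma}\varepsilon_{\alpha\zeta\eta}\,x_\beta\hat{p}_\gamma x_\zeta\hat{p}_\eta$$

Now we take advantage of the fact that $\varepsilon_{\alpha\beta\gamma}\varepsilon_{\alpha\zeta\eta} = \delta_{\beta\zeta}\delta_{\gamma\eta} - \delta_{\beta\eta}\delta_{\gamma\zeta}$,
which is proved by comparison with the vector equation
$(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{C}\times\mathbf{D}) = (\mathbf{A}\cdot\mathbf{C})(\mathbf{B}\cdot\mathbf{D}) - (\mathbf{A}\cdot\mathbf{D})(\mathbf{B}\cdot\mathbf{C})$. Then $\hat{M}^2$ can be
rewritten as

$$\hat{M}^2 = x_\beta\hat{p}_\gamma x_\beta\hat{p}_\gamma - x_\beta\hat{p}_\gamma x_\gamma\hat{p}_\beta$$

After performing certain manipulations according to the rules
(24.20)-(24.22), we reduce the right-hand side of the obtained expression to

$$\hat{M}^2 = x_\alpha^2\hat{p}_\beta^2 + \frac{h}{i}x_\alpha\hat{p}_\alpha - 3\frac{h}{i}x_\alpha\hat{p}_\alpha + \frac{h}{i}x_\alpha\hat{p}_\alpha - (x_\alpha\hat{p}_\alpha)^2 = x_\alpha^2\hat{p}_\beta^2 - \frac{h}{i}x_\alpha\hat{p}_\alpha - (x_\alpha\hat{p}_\alpha)^2$$

But the expression $x_\alpha\hat{p}_\alpha$ is just $(h/i)(\mathbf{r}\cdot\nabla)$, which in spherical coordinates
is rewritten simply as $(h/i)\,r(\partial/\partial r)$. Furthermore, $\hat{p}_\beta^2 = -h^2\nabla^2$.
Now substitute the Laplacian, written in terms of the spherical
coordinates, according to (11.46). We then see that the square of
the angular momentum is expressed simply in terms of the angular
part of the Laplacian:

$$\begin{aligned}
\hat{M}^2 &= -h^2 r^2\left[\frac{1}{r^2}\frac{\partial}{\partial r}r^2\frac{\partial}{\partial r} + \frac{1}{r^2\sin\vartheta}\frac{\partial}{\partial\vartheta}\sin\vartheta\frac{\partial}{\partial\vartheta} + \frac{1}{r^2\sin^2\vartheta}\frac{\partial^2}{\partial\varphi^2}\right] + h^2 r\frac{\partial}{\partial r} + h^2\left(r\frac{\partial}{\partial r}\right)^2 \\
&= -h^2\left[\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\sin\vartheta\frac{\partial}{\partial\vartheta} + \frac{1}{\sin^2\vartheta}\frac{\partial^2}{\partial\varphi^2}\right]
\end{aligned} \tag{24.28}$$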

The Eigenfunctions of the Angular Momentum Square and the
Angular Momentum Projection. Let us now find the eigenfunction
of the square of the angular momentum, that is, the solution of the
equation

$$\hat{M}^2\psi = M^2\psi \tag{24.29}$$

For this we proceed from the well-known equality

$$\nabla^2\frac{1}{r} = 0 \tag{24.30}$$

We take l arbitrary constant vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_i, \ldots, \mathbf{a}_l$
and form the following set of operators: $(\mathbf{a}_1\cdot\nabla), (\mathbf{a}_2\cdot\nabla), \ldots,
(\mathbf{a}_i\cdot\nabla), \ldots, (\mathbf{a}_l\cdot\nabla)$. Since the vectors are constant, all of the
written operators commute with the Laplacian. After applying all
the operators, we therefore write (24.30) in the form

$$\nabla^2(\mathbf{a}_1\cdot\nabla)(\mathbf{a}_2\cdot\nabla)\cdots(\mathbf{a}_i\cdot\nabla)\cdots(\mathbf{a}_l\cdot\nabla)\frac{1}{r} = 0 \tag{24.31}$$

We express the Laplacian in terms of spherical coordinates and
make use of (24.28) to get

$$\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}r^2\frac{\partial}{\partial r} - \frac{\hat{M}^2}{h^2r^2} \tag{24.32}$$


Now we take into account the circumstance that every operation
with $(\mathbf{a}_i\cdot\nabla)$ on $r^{-1}$ raises the power of r in the denominator by unity,
while in the numerator there is always found some function of the
angles. Therefore the result of operating l times with $(\mathbf{a}_i\cdot\nabla)$ on $r^{-1}$
can be represented as

$$(\mathbf{a}_1\cdot\nabla)\cdots(\mathbf{a}_l\cdot\nabla)\frac{1}{r} = \frac{f(\vartheta, \varphi)}{r^{l+1}} \tag{24.33}$$

We transfer the part of the Laplacian involving differentiation
with respect to r to the right-hand side of the equation. Then, differentiating
and cancelling out $r^{-(l+3)}$, we arrive at the equation

$$\hat{M}^2 f(\vartheta, \varphi) = h^2 l(l+1)\,f(\vartheta, \varphi) \tag{24.34}$$

But this equation coincides in form with the eigenvalue equation
for the operator $\hat{M}^2$ (24.29). Hence, we have found the eigenfunctions
of the operator $\hat{M}^2$ and its eigenvalues $h^2 l(l+1)$.
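As a worked illustration (a sketch for the simplest nontrivial case l = 1, with the single vector taken as the unit vector $\mathbf{n}_z$ along the z axis, a choice made here only for the example), the construction gives

```latex
(\mathbf{n}_z\cdot\nabla)\frac{1}{r} = \frac{\partial}{\partial z}\frac{1}{r}
  = -\frac{z}{r^{3}} = -\frac{\cos\vartheta}{r^{2}},
\qquad f(\vartheta,\varphi) = -\cos\vartheta ,
\\[6pt]
\hat{M}^{2}\cos\vartheta
  = -\frac{h^{2}}{\sin\vartheta}\frac{\partial}{\partial\vartheta}
     \left(\sin\vartheta\,\frac{\partial\cos\vartheta}{\partial\vartheta}\right)
  = \frac{h^{2}}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\sin^{2}\vartheta
  = 2h^{2}\cos\vartheta = h^{2}\cdot 1\cdot(1+1)\cos\vartheta .
```

The angular function obtained from one differentiation of $1/r$ is thus an eigenfunction of the angular part of (24.28) with the eigenvalue $h^2 l(l+1)$ for l = 1, in agreement with (24.34).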

The fact that the square of the angular momentum is equal to $h^2 l(l+1)$
and not to a perfect square may cause some surprise. It would appear
that if vector M is directed along a coordinate axis, the eigenvalue
of the square should be equal to the square of the eigenvalue of the
projection (we shall soon see that the latter is an integral multiple
of h). Actually, though,
the angular momentum projections do not commute, so that if one
of them takes a certain given value the others do not have any definite
values, zero included. For this reason the expression for the angular
momentum square is more involved than simply the square of an
integer. An exception is the case when all three angular momentum
projections are zero, which occurs when $l = 0$.

Does the formula $M^2 = h^2 l(l+1)$ with integral values of l cover all
the eigenvalues of the angular momentum square? If we consider
only the spatial motion of a particle with three degrees of freedom,
then there exist only integral l's. This is known as the orbital angular
momentum. The term was borrowed from Bohr’s old theory, which
assumed that particles travel along orbits. In the more general case,
when a particle also has rotation-type internal degrees of freedom,
half-integral l's are possible, but for the time being we shall consider
only orbital angular momentum.

It follows from Eq. (24.33) that an application of the operator
$(\mathbf{a}_i\cdot\nabla)$ to the expression appearing in it always results in the appearance
of scalar products of the form $(\mathbf{a}_i\cdot\mathbf{r})$, or of constant numbers
$(\mathbf{a}_i\cdot\mathbf{a}_k)$. They are all single-valued functions of the coordinates.
But such homogeneous functions can be obtained only for integral l.
Nonintegral l in a homogeneous function
signifies nonintegral powers of the coordinates, that is, the single-valuedness
disappears. But the eigenfunction of the operator $\hat{M}^2$
is the amplitude of the probability that for $M^2 = h^2 l(l+1)$ a
particle has a polar angle $\vartheta$ and azimuth $\varphi$:

$$dw = |f(\vartheta, \varphi)|^2\,d\Omega$$

This function must, by definition, be single valued. Therefore nonintegral
values of l for the orbital angular momentum are precluded.

From Eq. (24.26), the operator of the square of the angular momentum
commutes with the operator of any of its projections. The
same is apparent from (24.28) if we represent the angular momentum
square in the form

$$\hat{M}^2 = -\frac{h^2}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\sin\vartheta\frac{\partial}{\partial\vartheta} + \frac{\hat{p}_\varphi^2}{\sin^2\vartheta} \tag{24.35}$$

with the help of (24.9). Since the angle $\varphi$ is not explicitly involved in
(24.35), $\hat{M}^2$ commutes with $\hat{p}_\varphi$.

But this means that the quantity $M^2$ and the quantity $p_\varphi$ can
exist in the same state, that is, states with definite values of $M^2$
and $p_\varphi$ can be described in terms of the same eigenfunction. The
eigenfunction of $\hat{p}_\varphi$ is immediately apparent from (24.9). Writing
the eigenvalue equation for $\hat{p}_\varphi$, we have

$$\frac{h}{i}\frac{\partial\psi}{\partial\varphi} = p_\varphi\psi \tag{24.36}$$

whence

$$\psi = e^{ip_\varphi\varphi/h} \tag{24.37}$$


For arbitrary vectors $\mathbf{a}_i$ the eigenfunction $\psi$ does not correspond
to (24.37). But it is not difficult to choose these vectors in a way
such that an operation with $(\mathbf{a}_i\cdot\nabla)$ would reduce the eigenfunction
to the required form. For this we must select only two types of $\mathbf{a}_i$.
We take $l - k$ of these vectors as unit vectors along the z axis, and
the remaining k vectors in the form of a linear combination of unit
vectors along the x and y axes:

$$\mathbf{a}_\pm = \mathbf{n}_x \pm i\,\mathbf{n}_y$$

Then $(\mathbf{a}_\pm\cdot\nabla) = (\partial/\partial x) \pm i(\partial/\partial y)$. The operation of $(\mathbf{n}_z\cdot\nabla) = (\partial/\partial z)$
on $r^{-1}$ yields $-z/r^3$. Repeated application of the same operator
again yields z in the numerator when it acts on $r^{-3}$, and unity when
it acts on the z carried over from the preceding differentiation. Applying
the operator $(\partial/\partial x) + i(\partial/\partial y)$ to $r^{-1}$, we obtain

$$\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right)\frac{1}{r} = -\frac{x + iy}{r^3} = -\frac{\sin\vartheta\,e^{i\varphi}}{r^2}$$

Obviously, repeated operation with the same operator multiplies
the obtained expression once again by $e^{i\varphi}$, besides leading to the
appearance of powers of $\sin\vartheta$. Indeed, once again differentiating
the preceding expression in the same way, we find that

$$\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right)\frac{x + iy}{r^3} = \frac{1 + i^2}{r^3} - \frac{3(x + iy)^2}{r^5} = -\frac{3\sin^2\vartheta\,e^{2i\varphi}}{r^3}$$

and so on.

Thus, if the operator $(\partial/\partial x) + i(\partial/\partial y)$ is applied k times, the
result is proportional to $e^{ik\varphi}$. This is the eigenfunction of the operator
$\hat{p}_\varphi$ with the eigenvalue $p_\varphi = hk$. But since we have found that
any eigenvalue of $\hat{M}^2$ corresponds to the eigenvalue $h^2 l(l+1)$ for
integral l, and differentiation $(\partial/\partial x) + i(\partial/\partial y)$ is part of all the
differentiations of the form $(\mathbf{a}\cdot\nabla)$, we find that k, too, is of necessity
an integer. Obviously, it cannot be greater than l. Furthermore, if
the operator $(\partial/\partial x) - i(\partial/\partial y)$ is taken, then the eigenfunction
contains $-ik\varphi$ in the exponent, though again in absolute value k
does not exceed l. Thus, we have obtained all the eigenvalues $p_\varphi = hk$:

$$-hl \leqslant p_\varphi \leqslant hl, \qquad -l \leqslant k \leqslant l \tag{24.38}$$

In the process we developed the simultaneous eigenfunctions of
the operator of the square of the angular momentum and its projection,
$\hat{M}_z$. In mathematics these are known as spherical functions.
We have defined them as follows:

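$$Y_l^{\pm k}(\vartheta, \varphi) = r^{l+1}\left(\frac{\partial}{\partial x} \pm i\frac{\partial}{\partial y}\right)^k\left(\frac{\partial}{\partial z}\right)^{l-k}\frac{1}{r} \tag{24.39}$$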

They satisfy two equations simultaneously:

$$\hat{M}^2 Y_l^{\pm k} = h^2 l(l+1)\,Y_l^{\pm k} \tag{24.40a}$$

$$\hat{M}_z Y_l^{\pm k} = p_\varphi Y_l^{\pm k} = \pm hk\,Y_l^{\pm k} \tag{24.40b}$$
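The pair of equations (24.40a), (24.40b) can be verified exactly for a small l. The sketch below is an illustration under stated assumptions: h = 1, l = 2, and, instead of the functions $f/r^{l+1}$ of the text, the equivalent harmonic polynomials $r^l f$, which carry the same angular dependence. Polynomials in x, y, z are stored as dictionaries, a representation chosen here only for the example.

```python
H = 1  # quantum of action (the text's h), set to 1 as an assumption

def diff(poly, axis):
    """Partial derivative of a polynomial stored as {(a, b, c): coeff} for x^a y^b z^c."""
    out = {}
    for powers, coeff in poly.items():
        n = powers[axis]
        if n:
            key = powers[:axis] + (n - 1,) + powers[axis + 1:]
            out[key] = out.get(key, 0) + n * coeff
    return out

def times(poly, axis):
    """Multiply the polynomial by the coordinate with the given index."""
    return {p[:axis] + (p[axis] + 1,) + p[axis + 1:]: c for p, c in poly.items()}

def add(p, q, factor=1):
    """Return p + factor * q with zero coefficients dropped."""
    out = dict(p)
    for key, coeff in q.items():
        out[key] = out.get(key, 0) + factor * coeff
    return {key: c for key, c in out.items() if c != 0}

def scale(poly, factor):
    return {key: factor * c for key, c in poly.items()}

def m_component(poly, a, b):
    """(x_a p_b - x_b p_a) poly with p = (h/i) d/dx; indices (0, 1) give M_z."""
    return scale(add(times(diff(poly, b), a), times(diff(poly, a), b), -1), H / 1j)

def m_z(poly):
    return m_component(poly, 0, 1)

def m_squared(poly):
    total = {}
    for a, b in ((1, 2), (2, 0), (0, 1)):   # M_x, M_y, M_z in turn
        total = add(total, m_component(m_component(poly, a, b), a, b))
    return total

def is_eigen(operator, poly, eigenvalue):
    """True if operator(poly) equals eigenvalue * poly exactly."""
    return add(operator(poly), scale(poly, eigenvalue), -1) == {}

# Harmonic polynomials of degree l = 2, labelled by the projection number
states = {
    2: {(2, 0, 0): 1, (1, 1, 0): 2j, (0, 2, 0): -1},     # (x + iy)^2
    1: {(1, 0, 1): 1, (0, 1, 1): 1j},                     # (x + iy) z
    0: {(0, 0, 2): 2, (2, 0, 0): -1, (0, 2, 0): -1},      # 2z^2 - x^2 - y^2
    -1: {(1, 0, 1): 1, (0, 1, 1): -1j},                   # (x - iy) z
    -2: {(2, 0, 0): 1, (1, 1, 0): -2j, (0, 2, 0): -1},    # (x - iy)^2
}
l = 2
for k, poly in states.items():
    print(k, is_eigen(m_squared, poly, H**2 * l * (l + 1)), is_eigen(m_z, poly, H * k))
```

All 2l + 1 = 5 functions share the eigenvalue $h^2 l(l+1) = 6h^2$ of the square, while the projection on the z axis runs over the integers from -2 to 2 in units of h.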


The Stern-Gerlach Experiment. That the angular momentum projections
are integral multiples of h is confirmed by direct experiment. The idea
of the experiment consists in the following: a direct relationship
exists between the orbital angular momentum projection and the
magnetic moment projection (see (17.30)):

$$\mu_z = \frac{e}{2mc}\,M_z \tag{24.41}$$


A narrow beam of vapour of a substance under investigation is 
passed between the poles of an electromagnet in a strongly inhomo¬ 
geneous field; to achieve this, one of the poles should be made tapered. 
The particles (in the Stern-Gerlach experiment they are atoms) enter 
the field parallel to the edge of the taper, that is, they move in 
a direction perpendicular to the plane of the lines of force of the 
field. The plane of symmetry of the field passes through the edge of 
the taper and the direction of motion of the particles. We assume 
the z axis to be perpendicular to the edge of the taper and to lie 
in the plane of symmetry of the field. If the angular momentum of
the atoms has only discrete, integral projections on the z axis, then
the magnetic moment of the atoms has only several definite values
corresponding to the angular momentum. The deflecting force acting
on a particle possessing magnetic moment in a magnetic field is, by
(17.35)

$$(\boldsymbol{\mu}\cdot\nabla)H_z = \mu_z\frac{\partial H_z}{\partial z} = k\,\frac{eh}{2mc}\,\frac{\partial H_z}{\partial z} \tag{24.42}$$

In the plane of symmetry of the field, H is in the z-direction and
depends only upon z.

Since the angular momentum $M_z$ can only have a definite set of
values, the deflecting force acting upon the atoms in the beam also
has a very definite value for particles with the respective angular
momentum projection $M_z$. It can be seen from (24.42) that the force
is a multiple of $(eh/2mc)(\partial H_z/\partial z)$. Therefore, the particles in the
beam experience only those deflections in the magnetic field which
correspond to the possible values of the force (24.42). In other words,
a beam of particles in a deflecting magnetic field does not take the
form of a continuous fan, as could be expected according to the
classical theory, but separates into as many discrete beams as the
number of values k takes.

The number k, as we have just shown (see (24.38)), lies between
the limits from $-l$ to $l$, that is, it acquires $2l + 1$ values. Figure 29
presents the separation pattern for the case of $l = 1$. As for the
number l itself, it does not vary in the beam and corresponds to the
state of the atoms in it with the least possible energy. This state is
associated with a definite value of the square of angular momentum
(in the classical analogy, with the “centrifugal energy”, as in Section 5)
and usually does not vary as the substance evaporates. The
thermal energy of evaporation is insufficient to alter the energy
state of an atom, therefore all the atoms travel with the same value
of l but with random distributions of the projection of angular
momentum k.
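The counting of beams can be written out in a few lines. This is a minimal sketch: the function names are hypothetical, and the force is expressed in units of $(eh/2mc)(\partial H_z/\partial z)$, so that by (24.42) it is simply equal to k.

```python
def beam_components(l):
    """Allowed projections k = -l, ..., l of the orbital angular momentum."""
    return list(range(-l, l + 1))

def deflecting_forces(l, unit_force=1.0):
    """Forces from (24.42); unit_force stands for (e h / 2 m c) dH_z/dz (hypothetical scale)."""
    return [k * unit_force for k in beam_components(l)]

for l in (0, 1, 2):
    print(l, len(beam_components(l)), deflecting_forces(l))
```

For l = 1 this gives three beams, one of them undeflected, which corresponds to the separation pattern of Figure 29.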

The fact that k is restricted to integers is associated with the 
fact that two angular momentum projections cannot exist simulta¬ 
neously. 

This is easily linked with the Stern-Gerlach experiment. Indeed, 
the direction of the magnetic field (or the z axis) is taken to be 
completely arbitrary. All permitted values of the projection of 
the angular momentum on this axis are equiprobable. We could 
measure the angular momentum projections on a certain spatial 
axis and then pass the same beams through a magnetic field at a very 
small angle with the field in which the first measurement was carried 
out. Both measurements will, naturally, yield integers for the pro¬ 
jections. But the same vector cannot simultaneously have integral 
projections on infinitely close, but in other respects random, direc¬ 
tions: when the first measurement was undertaken, the angular 
momentum had projections only on the first direction of the field, 
and correspondingly, in the second measurement only on the second 
direction of the field. 

We find that just as the coordinate and linear momentum do not 
exist simultaneously, so two angular momentum projections do not 
exist in the same state. 


EXERCISES 

1. Calculate the commutators $[\hat{M}_x\hat{p}_x]$, $[\hat{M}_x\hat{p}_y]$, $[\hat{M}_x\hat{p}_z]$, $[\hat{M}_x x]$, $[\hat{M}_x y]$,
$[\hat{M}_x z]$, $[\hat{M}^2\hat{p}_x]$, $[\hat{M}^2\hat{p}^2]$, and $[\hat{M}^2 r^2]$, where the brackets symbolise
commutators of the operators inside them.

2. Prove that $\hat{p}_x f(x) - f(x)\hat{p}_x = (h/i)(df/dx)$, where $f(x)$ is a function
of x.

3. Verify that $[\hat{a}[\hat{b}\hat{c}]] + [\hat{b}[\hat{c}\hat{a}]] + [\hat{c}[\hat{a}\hat{b}]] = 0$ (the Jacobi identity).

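4. If the commutator of two operators, $[\hat{a}\hat{b}] = \hat{a}\hat{b} - \hat{b}\hat{a}$, is a number and not an
operator, then

$$e^{\hat{a} + \hat{b}} = e^{-[\hat{a}\hat{b}]/2}\,e^{\hat{a}}e^{\hat{b}}$$

where $e^{\hat{a}} = 1 + \hat{a} + \dfrac{\hat{a}^2}{2!} + \ldots$ Prove this statement.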

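5. Prove that in going over to spherical coordinates the following
equations hold true:

$$\hat{M}_\pm \equiv \hat{M}_x \pm i\hat{M}_y = \pm h\,e^{\pm i\varphi}\left(\frac{\partial}{\partial\vartheta} \pm i\cot\vartheta\,\frac{\partial}{\partial\varphi}\right)$$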


25


EXPANSIONS IN WAVE FUNCTIONS


The Superposition Principle. One of the most fundamental ideas 
of quantum mechanics is that its equations are linear with respect 
to the wave function $\psi$. This result proceeds from the whole set of
facts that confirm the correctness of quantum mechanics, in the 
same way as an analogous result in classical electrodynamics (see 
Sec. 15), which is also a generalization of experience. 

For example, the diffraction of electrons shows that the amplitudes 
of wave functions are combined in the same simple way as the ampli¬ 
tudes of waves in optics; diffraction maxima and minima are situated 
at the same positions, determined only by the phase relationships, 
independently of the wave intensities. All this points to the linearity 
of wave equations; the solutions of nonlinear equations behave in an 
entirely different manner. 

The sum of two solutions of a linear equation again satisfies the 
same equation. It follows from this that any solution of a wave 
equation can be represented in the form of a certain set of standard 
solutions, similar to the way that, in Section 19, a travelling non¬ 
periodic wave was represented by a set of travelling harmonic waves. 

The statement concerning the possibility of representing a single 
wave function in terms of the sum of other wave functions is called 
the superposition principle.

Hermitian Operators. Wave functions can usually be represented 
with the aid of the sum of other wave functions which are eigen¬ 
functions of certain quantum mechanical operators. In the present 
section it will be shown how such expansions are performed. First 
of all, however, it is necessary to establish certain general properties 
of operators whose eigenvalues are measurable physical quantities. 

We have already had an example of measuring a physical quantity
in the Stern-Gerlach experiment. The total number of particle beams
resulting from the splitting of the initial beam in a magnetic field
defines the number l, that is, the angular momentum square, while
the number of a given individual beam in the assembly of $2l + 1$
beams defines the projection of the angular momentum hk on the
selected axis.

Obviously, the measured eigenvalues must be real numbers,
although the operators themselves may depend explicitly upon
$i = \sqrt{-1}$ (see (24.3) and (24.9)). We shall consider the equation
for the eigenfunctions of the operator $\hat{\lambda}$ and another equation involving
its complex conjugate:

$$\hat{\lambda}\psi = \lambda\psi \tag{25.1a}$$

$$\hat{\lambda}^*\psi^* = \lambda^*\psi^* \tag{25.1b}$$

We must find the condition for which the eigenvalues are real
numbers, $\lambda^* = \lambda$.

To do this, we multiply (25.1a) by $\psi^*$ and (25.1b) by $\psi$, integrate
over the whole range of x (upon which the operators may depend),
and subtract one from the other. This yields

$$\int\left(\psi^*\hat{\lambda}\psi - \psi\hat{\lambda}^*\psi^*\right)dx = (\lambda - \lambda^*)\int\psi^*\psi\,dx$$

But the integral of $\psi^*\psi = |\psi|^2$ cannot be equal to zero, since
$|\psi|^2$ is an essentially positive quantity.

The eigenvalue of the observed quantity, $\lambda$, is by definition real,
that is $\lambda = \lambda^*$; therefore we arrive at the relation

$$\int\left(\psi^*\hat{\lambda}\psi - \psi\hat{\lambda}^*\psi^*\right)dx = 0 \tag{25.2}$$

Equation (25.2) must hold not only for the pair of complex conjugate
eigenfunctions of the operator $\hat{\lambda}$, that is $\psi(\lambda, x)$ and $\psi^*(\lambda, x)$,
but for any pair of functions $\chi^*(x)$, $\psi(x)$ which satisfy the same
conditions (finiteness, uniqueness, continuity) as the eigenfunctions
$\psi(\lambda, x)$:

$$\int\left(\chi^*\hat{\lambda}\psi - \psi\hat{\lambda}^*\chi^*\right)dx = 0 \tag{25.3}$$


The necessity of such a more comprehensive formula in comparison 
with (25.2) will be explained later in this section. An operator for 
which Eq. (25.3) is satisfied is called a Hermitian operator and the 
corresponding property of such operators is termed Hermiticity. 

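We shall now verify whether the operators introduced up till now
are Hermitian. Take, for example, $\hat M_z = (h/i)(\partial/\partial\varphi)$, for which,
integrating by parts, we obtain

$$\int_0^{2\pi}\chi^*\hat M_z\psi\,d\varphi = \frac{h}{i}\int_0^{2\pi}\chi^*\frac{\partial\psi}{\partial\varphi}\,d\varphi = \frac{h}{i}\,\chi^*\psi\,\Big|_0^{2\pi} - \frac{h}{i}\int_0^{2\pi}\psi\,\frac{\partial\chi^*}{\partial\varphi}\,d\varphi$$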


The eigenfunctions of the operator $\hat M_z$ were found in the preceding
section. They are equal to $e^{ik\varphi}$. For integral $k$, such functions satisfy
the single-valuedness condition $\psi(\varphi) = \psi(\varphi + 2\pi)$, so that the
functions $\chi^*$ and $\psi$ substituted into the integral must also be single
valued. The integrated part of the expression therefore vanishes
when the limits are substituted, whence follows the Hermiticity
of $\hat M_z$:

$$\int_0^{2\pi}\chi^*\hat M_z\psi\,d\varphi = \int_0^{2\pi}\psi\,\hat M_z^*\chi^*\,d\varphi$$

in agreement with the general requirement (25.3).
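
The vanishing of the integrated part can also be checked numerically. The following is only a sketch under stated assumptions: the constant $h$ is set to 1, the derivative is replaced by a periodic central difference on a grid of `N` points, and the two test functions `chi` and `psi` are arbitrary single-valued functions chosen for illustration (none of these names come from the text).

```python
import cmath
import math

H = 1.0   # the constant h of the text, set to 1 (arbitrary choice of units)
N = 400   # number of grid points on 0 <= phi < 2*pi (hypothetical value)
DPHI = 2 * math.pi / N
GRID = [n * DPHI for n in range(N)]

def apply_mz(f):
    """Apply M_z = (h/i) d/dphi using a periodic central difference."""
    return [(H / 1j) * (f[(n + 1) % N] - f[n - 1]) / (2 * DPHI) for n in range(N)]

def integrate(f):
    """Rectangle-rule integral over one full period."""
    return sum(f) * DPHI

# Two arbitrary single-valued (2*pi-periodic) test functions.
chi = [cmath.exp(2j * p) + 0.5 * math.cos(p) for p in GRID]
psi = [cmath.exp(-1j * p) * (1 + 0.3 * math.sin(3 * p)) for p in GRID]

mz_psi = apply_mz(psi)
mz_chi = apply_mz(chi)

# Left side of the Hermiticity condition: integral of chi* (M_z psi).
lhs = integrate([chi[n].conjugate() * mz_psi[n] for n in range(N)])
# Right side: integral of psi (M_z* chi*), where M_z* chi* = (M_z chi)*.
rhs = integrate([psi[n] * mz_chi[n].conjugate() for n in range(N)])
```

The two integrals agree to rounding error because the functions are periodic; for a function that is not single valued the boundary term would survive.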

In the case of $\hat M^2$ we must take, instead of $d\varphi$, the solid angle
element $d\Omega$. Then the Hermiticity of $\hat M^2$ is proved by double integration
by parts: over $\varphi$ and over $\vartheta$. For $\hat{\mathscr H}$ (the Hamiltonian), $dx$ corresponds
to $dV$. To prove the Hermiticity of $\hat{\mathscr H}$ the integration must be performed
with the help of the Gauss theorem.

The Orthogonality of Eigenfunctions. An important property of
eigenfunctions follows from the Hermiticity of operators. Let us
consider the equations for two eigenvalues of the same operator $\hat\lambda$:

$$\hat\lambda\psi(\lambda, x) = \lambda\psi(\lambda, x) \tag{25.4a}$$

$$\hat\lambda^*\psi^*(\lambda', x) = \lambda'\psi^*(\lambda', x) \tag{25.4b}$$

We multiply (25.4a) by $\psi^*(\lambda', x)$, and (25.4b) by $\psi(\lambda, x)$, integrate
over $x$, and subtract one from the other to get

$$\int[\psi^*(\lambda', x)\hat\lambda\psi(\lambda, x) - \psi(\lambda, x)\hat\lambda^*\psi^*(\lambda', x)]\,dx = (\lambda - \lambda')\int\psi^*(\lambda', x)\psi(\lambda, x)\,dx \tag{25.5}$$

The left-hand side of this equation vanishes in accordance with
the general requirements for Hermiticity (25.3). Therefore, if
$\lambda' \neq \lambda$ the following integral must vanish:

$$\int\psi^*(\lambda', x)\psi(\lambda, x)\,dx = 0 \tag{25.6}$$

This property of wave functions is called orthogonality.
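
The same argument can be tried on a finite-dimensional stand-in, where the integral over $x$ becomes a sum over components. This is merely an illustrative sketch: the $2\times 2$ matrix `LAM` and its two eigenvectors are our own example, not taken from the text, and the helper names are hypothetical.

```python
import math

# A Hermitian 2x2 matrix: LAM[a][b] equals the complex conjugate of LAM[b][a].
LAM = [[0, -1j], [1j, 0]]

def apply(matrix, vec):
    """Matrix acting on a column of components."""
    return [sum(matrix[a][b] * vec[b] for b in range(len(vec)))
            for a in range(len(matrix))]

def inner(u, v):
    """Discrete analogue of the integral of u* v over x."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

S = 1 / math.sqrt(2)
psi_plus = [S, S * 1j]     # eigenvector belonging to the eigenvalue +1
psi_minus = [S, -S * 1j]   # eigenvector belonging to the eigenvalue -1

overlap = inner(psi_plus, psi_minus)   # vanishes, as (25.6) requires
```

Both eigenvalues are real and different, and the overlap of the two eigenvectors is zero, in line with the general statement.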

Sometimes several quantities $\lambda$, $\nu$, etc. may correspond to the
same state of a system. For that the operators $\hat\lambda$, $\hat\nu$, ... must
commute. For example, for free motion of a particle there exist $p_x$, $p_y$,
and $p_z$ or, as we have shown before, $M^2$ can have an eigenvalue
together with $M_z$. Then we develop functions that are simultaneous
eigenfunctions with respect to all operators, the eigenfunctions
being of the type $e^{i(p_x x + p_y y + p_z z)/h}$ or of the type of the
simultaneous eigenfunctions of $\hat M^2$ and $\hat M_z$. In the most general case
the eigenvalue equations

$$\hat\lambda\psi(\lambda, \nu; x) = \lambda\psi(\lambda, \nu; x)$$

$$\hat\nu\psi(\lambda, \nu; x) = \nu\psi(\lambda, \nu; x)$$

must hold if

$$[\hat\lambda\hat\nu] = \hat\lambda\hat\nu - \hat\nu\hat\lambda = 0$$

For such functions

$$\int\psi^*(\lambda', \nu'; x)\psi(\lambda, \nu; x)\,dx = 0 \tag{25.7}$$

if $\lambda' \neq \lambda$ or $\nu' \neq \nu$.

Expansion in Eigenfunctions. Let us suppose that the eigenfunctions
of a certain operator $\hat\lambda$ are known. These functions satisfy,
in addition to the equation $\hat\lambda\psi = \lambda\psi$, certain requirements associated
with the conditions of the eigenfunction problem: they are finite,
continuous, single valued, and so forth. Then, in accordance with
the superposition principle, any function $\psi(x)$ which satisfies the
same requirements may be represented as the sum of the eigenfunctions
of the operator $\hat\lambda$:

$$\psi(x) = \sum_{\lambda'} c_{\lambda'}\psi(\lambda', x) \tag{25.8}$$

We shall show how to determine the expansion coefficients $c_\lambda$.
For this we multiply both sides of the equation by $\psi^*(\lambda, x)$ and
integrate over $x$:

$$\int\psi^*(\lambda, x)\psi(x)\,dx = \sum_{\lambda'} c_{\lambda'}\int\psi^*(\lambda, x)\psi(\lambda', x)\,dx \tag{25.9}$$

In accordance with the orthogonality condition all the integrals
in the right-hand side of (25.9) vanish except the one with $\lambda' = \lambda$.
Consequently, there remains the equation

$$\int\psi^*(\lambda, x)\psi(x)\,dx = c_\lambda\int\psi^*(\lambda, x)\psi(\lambda, x)\,dx = c_\lambda\int|\psi(\lambda, x)|^2\,dx$$

We shall consider the eigenfunctions $\psi(\lambda, x)$ normalized to unity,
that is $\int|\psi|^2\,dx = 1$ (see (23.17)). The normalization condition
can be written together with the orthogonality condition (25.6)
in the form of one equation with the help of the symbol $\delta_{\lambda\lambda'}$, where
$\delta_{\lambda\lambda'} = 0$ for $\lambda \neq \lambda'$ and $\delta_{\lambda\lambda'} = 1$ for $\lambda = \lambda'$:

$$\int\psi^*(\lambda', x)\psi(\lambda, x)\,dx = \delta_{\lambda\lambda'} \tag{25.10a}$$

where $\delta_{\lambda\lambda'}$ is, of course, not a tensor but simply a symbol with
the stated properties.

For the expansion coefficient we obtain

$$c_\lambda = \int\psi^*(\lambda, x)\psi(x)\,dx \tag{25.11a}$$

In the case when we have a system of commuting operators $\hat\lambda$, $\hat\nu$,
Eq. (25.11a) is directly generalized to

$$c_{\lambda,\nu} = \int\psi^*(\lambda, \nu; x)\psi(x)\,dx \tag{25.11b}$$

if

$$\int\psi^*(\lambda', \nu'; x)\psi(\lambda, \nu; x)\,dx = \delta_{\lambda\lambda'}\delta_{\nu\nu'} \tag{25.10b}$$

Thus, a state¹ $\psi(x)$ in which the quantity $\lambda$ has no definite value
because $\psi(x)$ is not an eigenfunction of $\hat\lambda$, is represented as the sum
of states with strictly defined eigenvalues $\lambda$. The component of
the wave function corresponding to a certain value of $\lambda$ is

$$c_\lambda\psi(\lambda, x) \tag{25.12}$$

It represents the probability amplitude of the given value of
the quantity $\lambda$ in the state $\psi(x)$. To find the probability $w_\lambda$ of the
occurrence of the quantity $\lambda$, we must eliminate the dependence
on $x$, since $x$ and $\lambda$ do not exist in the same state.

For this we determine the probability density of the state with
the given $\lambda$, that is $|c_\lambda|^2|\psi(\lambda, x)|^2$, and integrate over $x$. From
the normalization condition for eigenfunctions we obtain

$$w_\lambda = |c_\lambda|^2\int|\psi(\lambda, x)|^2\,dx = |c_\lambda|^2 \tag{25.13}$$


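Let us now verify that the quantities $|c_\lambda|^2$ possess the basic
property of probability: their sum over $\lambda$ is equal to unity if the
function $\psi(x)$ itself satisfies the normalization condition (23.17).
Indeed, making use of the orthogonality condition (25.10a), we
obtain

$$1 = \int|\psi(x)|^2\,dx = \int\Bigl(\sum_\lambda c_\lambda^*\psi^*(\lambda, x)\Bigr)\Bigl(\sum_{\lambda'} c_{\lambda'}\psi(\lambda', x)\Bigr)dx = \sum_{\lambda\lambda'} c_\lambda^* c_{\lambda'}\delta_{\lambda\lambda'} = \sum_\lambda|c_\lambda|^2$$

Comparing with (25.13), we have

$$\sum_\lambda w_\lambda = \sum_\lambda|c_\lambda|^2 = 1 \tag{25.14}$$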
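
As a rough numerical illustration of the expansion coefficients and of the sum rule for probabilities, we may take the functions $e^{ik\varphi}/\sqrt{2\pi}$ on the circle as the orthonormal set. The state $\psi(\varphi) = (1 + \cos\varphi)/\sqrt{3\pi}$, the grid size `N`, and the function names are assumptions made only for this sketch.

```python
import cmath
import math

N = 256                       # grid points on 0 <= phi < 2*pi (hypothetical)
DPHI = 2 * math.pi / N
GRID = [n * DPHI for n in range(N)]

def eigenfunction(k, phi):
    """Normalized eigenfunction exp(i k phi) / sqrt(2 pi)."""
    return cmath.exp(1j * k * phi) / math.sqrt(2 * math.pi)

def state(phi):
    """Example normalized state psi(phi) = (1 + cos phi) / sqrt(3 pi)."""
    return (1 + math.cos(phi)) / math.sqrt(3 * math.pi)

def coefficient(k):
    """c_k = integral of psi*(k, phi) psi(phi) dphi (rectangle rule)."""
    return sum(eigenfunction(k, p).conjugate() * state(p) for p in GRID) * DPHI

coeffs = {k: coefficient(k) for k in range(-3, 4)}
probabilities = {k: abs(c) ** 2 for k, c in coeffs.items()}
total = sum(probabilities.values())   # should come out equal to unity
```

Only $k = 0$ and $k = \pm 1$ contribute (with weights 2/3, 1/6, and 1/6), and the weights add up to one.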


¹ For the sake of brevity it is customary to speak of a state $\psi(x)$ instead
of “a state with wave function $\psi(x)$”.

Thus, the coefficient $c_\lambda$ should be regarded as a probability amplitude
similar to $\psi(x)$. But $\psi(x)$ is connected with the probability
of finding a particle with coordinate $x$ independently of $\lambda$, while
$c_\lambda$ is connected with the probability of finding it with the given
value of $\lambda$ independently of $x$.

Now we can return to the meaning of the condition for the Hermiticity
of operators, (25.3). From this condition derives the orthogonality
of eigenfunctions with different values of $\lambda$. If a system is in
a state with a certain eigenvalue $\lambda$, the probability amplitude of
another value $\lambda' \neq \lambda$ occurring in that state vanishes, because it
follows from (25.11a) that

$$c_{\lambda'} = \int\psi^*(\lambda', x)\psi(\lambda, x)\,dx = 0 \tag{25.15}$$

provided $\lambda' \neq \lambda$. Thus, from the Hermiticity of an operator follows
the reality of its eigenvalues and the possibility of “pure states”
with exactly defined values of the corresponding quantity.

An expansion of a function $\psi(x)$ in the eigenfunctions of the operator $\hat\lambda$
is very like the expansion of a vector in unit vectors directed along
the axes of a Cartesian coordinate system. The part of these unit
vectors is played by the eigenfunctions of the operator, $\psi(\lambda, x)$,
and the part of the vector projections on the axes by the expansion
coefficients, $c_\lambda$. The normalization condition is similar to the choice
of unit vectors for the expansion, $(\mathbf n^{(i)})^2 = 1$, and if the expanded
vector is of unit length, that is, itself normalized, the sum of the
squares of its projections is also unity, like the condition
$\sum_\lambda|c_\lambda|^2 = 1$.

The definition of the magnitude of the projection of a vector $\mathbf A$
on an axis in tensor notation is

$$A_i = A_\alpha n_\alpha^{(i)}$$

Then

$$A_\alpha = \sum_i A_i n_\alpha^{(i)}$$

These two formulas should be compared with (25.8) and (25.11a).
Summation over the tensor index $\alpha$ is similar to integration over $x$;
summation over the unit vector’s number $i$ is similar to summation
over $\lambda$. We have, as it were, a vector $\psi(x)$ in a space with an infinite
number of dimensions instead of the three dimensions of Euclidean
space. Besides, the “length” of such a vector is defined not as the
sum of the squares of its components but as the sum of the squares
of their moduli, since the vector is complex.
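
The analogy can be made concrete with an ordinary three-component complex vector. In the sketch below the rotated frame `UNIT_VECTORS` and the vector `A` are arbitrary examples of ours; the projection is taken with a complex conjugate on the unit vector, which is our assumption for complex components and reduces to the formula of the text for real unit vectors.

```python
import math

# An orthonormal set of unit vectors n^(i) (a hypothetical rotated frame).
S = 1 / math.sqrt(2)
UNIT_VECTORS = [
    [S, S, 0.0],
    [-S, S, 0.0],
    [0.0, 0.0, 1.0],
]

def projections(vector):
    """A_i = sum over alpha of conj(n^(i)_alpha) * A_alpha."""
    return [sum(complex(n_a).conjugate() * a for n_a, a in zip(n, vector))
            for n in UNIT_VECTORS]

def reconstruct(coeffs):
    """A_alpha = sum over i of A_i * n^(i)_alpha."""
    return [sum(c * n[alpha] for c, n in zip(coeffs, UNIT_VECTORS))
            for alpha in range(3)]

A = [0.6, 0.8j, 0.0]                 # complex vector of unit "length"
A_i = projections(A)                 # analogue of the coefficients c_lambda
A_back = reconstruct(A_i)            # analogue of the expansion of psi(x)
length_squared = sum(abs(c) ** 2 for c in A_i)
```

The vector is recovered from its projections, and the sum of the squared moduli of the projections equals the squared "length" of the vector.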

The comparison of $\psi(x)$ with a vector can be continued. For that
we refer to Eq. (9.9a), which is used to find the principal axes and
principal values of the inertia tensor $I_{\alpha\beta}$. This equation is similar
to (24.12), from which we define the eigenvalues of the operator $\hat\lambda$
and the eigenfunctions $\psi(\lambda, x)$, which, as was just pointed out,
are analogous to unit vectors along the Cartesian axes. The orthogonality
of wave functions has an analogy in the perpendicularity of
the principal axes of the inertia tensor:

$$n_\alpha^{(i)} n_\alpha^{(k)} = 0 \quad (i \neq k)$$

It was shown in Section 9 that the perpendicularity of the principal
axes of the inertia tensor derives from the symmetry of the
tensor, $I_{\alpha\beta} = I_{\beta\alpha}$. In a space of complex vectors the requirement
in place of symmetry is the Hermiticity of the operators. We shall
later show that symmetry and Hermiticity are written in very similar
fashion.

The mathematical concept of a complex space similar to Euclidean
space was of great help in formulating the laws of quantum mechanics.
It is known in mathematics as Hilbert space.

Expansion in Eigenfunctions of the Angular Momentum Projection.
We shall explain the meaning of expansions in eigenfunctions with
the example of the Stern-Gerlach experiment. A beam of atoms
splits into a certain number of separate beams according to the
number of projections of the angular momentum on the magnetic
field, $M_z = hk$. If the greatest value of the projection is equal to $hl$,
then $k$, as pointed out before, assumes $2l + 1$ values, from $-l$ to $l$,
changing by unity.

The eigenfunction corresponding to $M_z = hk$ is

$$\psi(k, \varphi) = \frac{e^{ik\varphi}}{(2\pi)^{1/2}} \tag{25.16a}$$

where the factor $(2\pi)^{-1/2}$ is introduced for normalization:

$$\int_0^{2\pi}|\psi|^2\,d\varphi = 1$$

If each of the separate beams is once again passed through a magnetic
field parallel to the z axis, there is no further splitting; this is
because $M_z$ in these beams has a single definite value and not the
whole set of values in the range $-l \leq k \leq l$, as was the case in the
initial beam. From this the meaning of the orthogonality of eigenfunctions
is very well seen. If a particle is found in a beam corresponding
to a given value of $k$, then the probability of finding it in
a beam with a different value of the projection $M_z = hk' \neq hk$ is
equal to zero.

From the general rule, the probability equals the square of the
modulus of the expansion coefficient $c_{k'}$ of the function $\psi(k, \varphi)$
in functions $\psi(k', \varphi)$, that is, according to the general formula

$$c_{k'} = \int_0^{2\pi}\psi^*(k', \varphi)\psi(k, \varphi)\,d\varphi = \frac{1}{2\pi}\int_0^{2\pi}e^{i\varphi(k - k')}\,d\varphi = \frac{1}{2\pi}\,\frac{e^{i\varphi(k - k')}}{i(k - k')}\Bigg|_0^{2\pi} = \begin{cases}0, & k' \neq k\\ 1, & k' = k\end{cases}$$
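
The result of this integration is easy to confirm numerically. The sketch below evaluates the overlap of two normalized functions $e^{ik\varphi}/\sqrt{2\pi}$ by the rectangle rule; the grid size `N` and the function name `overlap` are our own illustrative choices.

```python
import cmath
import math

N = 128                      # grid points on 0 <= phi < 2*pi (hypothetical)
DPHI = 2 * math.pi / N

def overlap(k_prime, k):
    """Integral over 0..2*pi of psi*(k', phi) psi(k, phi), rectangle rule."""
    total = 0j
    for n in range(N):
        phi = n * DPHI
        total += cmath.exp(1j * (k - k_prime) * phi)
    return total * DPHI / (2 * math.pi)

# Table of overlaps for k, k' = -2, ..., 2: unity on the diagonal, zero elsewhere.
table = {(kp, k): overlap(kp, k) for kp in range(-2, 3) for k in range(-2, 3)}
```

For integral $k$ and $k'$ the table reproduces the symbol $\delta_{kk'}$ to rounding error.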

If the second magnetic field is along the x axis, then splitting
will again occur due to the component of angular momentum $M_x$,
which does not exist simultaneously with $M_z$. The number of splitting
components is again equal to $2l + 1$, since it is determined by the
maximum angular momentum projection $l$. This quantity cannot
depend upon the direction of the magnetic field, and is related only
to the atomic states in the original beam.

The eigenfunctions of $\hat M_x$ are

$$\psi(k_x, \omega) = \frac{e^{ik_x\omega}}{(2\pi)^{1/2}} \tag{25.16b}$$

where $-l \leq k_x \leq l$, and $\omega$ is the angle of rotation about the x axis.

Functions (25.16a) and (25.16b) do not coincide, which is a natural
consequence of their being eigenfunctions of noncommuting operators.

As a result of magnetic splitting in a field directed along the x
axis, a beam with given value of $k$ is split into $2l + 1$ beams with
definite values of $M_x$. Hence, the function (25.16a) is represented
as the sum of functions (25.16b):

$$\psi(k, \varphi) = \sum_{k_x = -l}^{l} c_{k_x}\psi(k_x, \omega) \tag{25.17}$$

The square of the modulus, $|c_{k_x}|^2$, indicates the proportion of
particles from a beam with a given value of $k$ that will occur in
the beam corresponding to the angular momentum projection on
the x axis equal to $hk_x$.

If the second magnetic field is not oriented along the x axis and
makes a small angle with the initial axis z, then the largest square
of the modulus among expansion coefficients in (25.17) occurs for
the one for which the value of the projection on the new axis ($k_x$)
closely approximates the projection on the old axis ($k$). The other
coefficients are small.
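
For $l = 1$ these proportions can be written out in closed form. The sketch below quotes the standard squared moduli for a beam with $k = +1$ analysed along an axis tilted by an angle `beta` from z; these expressions are not derived in the text and are brought in here purely as an assumption for illustration, as is the function name.

```python
import math

def tilt_probabilities(beta):
    """Squared moduli of the l = 1 expansion coefficients of the k = +1 state
    in states with definite projection on an axis tilted by the angle beta."""
    c = math.cos(beta)
    return {
        +1: ((1 + c) / 2) ** 2,
        0: (math.sin(beta) ** 2) / 2,
        -1: ((1 - c) / 2) ** 2,
    }

perpendicular = tilt_probabilities(math.pi / 2)   # second field along x
small_tilt = tilt_probabilities(0.1)              # nearly parallel fields
```

With the second field along x the beam divides in the proportions 1/4, 1/2, 1/4, whereas for a small tilt almost all particles stay in the beam with the same projection number.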

The Wave Function and the Measurement of Quantities. The foregoing
example of determining the angular momentum projections
explains the role of measurement processes in quantum mechanics.
Before the Stern-Gerlach experiment was carried out with the initial
beam, that is, before the particles passed through a magnetic field,
nothing at all could be said of the value of their angular momentum
projections. Depending on the orientation of the field the angular
momentum projections were obtained either on the z axis or on
the x axis, but, of course, not both simultaneously.

From this can be seen the special part that measuring instruments
play in quantum mechanics, which is substantially different from
their part in classical (nonquantum) physics. For whereas classical
measurements have an infinitesimal effect on the object being measured,
measurements carried out on microscopic entities may affect
them so greatly as to simply preclude other simultaneous measurements.

For example, when measuring the angular momentum projection
on the z axis, it is impossible simultaneously to measure another
projection. This follows convincingly from the Stern-Gerlach experiment.
In measuring the coordinate of a particle, we cannot at the
same time measure its linear momentum in the corresponding
direction. To measure the coordinate to an accuracy of $\Delta x$, the
particle must be passed through a slit of width $\Delta x$, but then diffraction
produces an inaccuracy in the linear momentum of
$\Delta p_x \sim 2\pi h/\Delta x$. Here the measuring instrument is a slitted screen. Thus,
in considering measuring processes we must reckon with the measuring
instrument, which is a classical entity, like a magnet or a slitted
screen.

The determination of any physical quantity is inseparably linked
with the method of measuring it. In classical physics the connection
is less apparent, since the measurement has an infinitesimal effect
on the measured object. In quantum mechanics the reverse is true:
as a rule the state of the measured object after the measurement
differs from what it was prior to the measurement. In accordance
with the uncertainty principle, it is therefore meaningless to carry
out simultaneous measurements of certain quantities. Thus, the
principle is substantiated in analysing the measuring operation.
In this sense, the action quantum $h$ measures, as it were, the effect
of the instrument on the micro-object; for example, it sets the
uncertainty of the linear momentum appearing in the particle’s
passage through the slit.

It should not be imagined, however, that the experimenter “interferes”
in some way with the results of physical measurements. As
a result of experiments on a large number of identical objects we
can determine the state they were in prior to the measurement quite
independently of the method of measurement. All the experimenter
does is select the method (for example, the direction of the magnetic
field in the beam-splitting experiment); but he then obtains a very
definite number of splitting components and a definite intensity of
every beam. If a single passage of a beam through a magnetic field
is being studied, the experimenter always has $2l + 1$ splitting
components, from which he draws the conclusion that in the given
beam the angular momentum projection had no definite value.

Now let one of the beams, with a given value of $k$, be passed into
another chamber, and let a second experimenter perform the
Stern-Gerlach experiment again.

If the magnetic field in the second experiment is oriented arbitrarily,
he will obtain, as mentioned, $2l + 1$ beams, but by varying
the direction of the magnetic field he can achieve a state in which
no further splitting occurs. This happens when the second magnetic
field is directed parallel to the initial one. Hence, the second experimenter
will conclude that the particles in the incoming beam have
a wave function of the form (25.16a). Here the measurement makes
it possible to establish that prior to the measurement process the
particles were in some definite “pure” state.

If the field is so oriented that a new splitting of the particle beam 
is observed in the second experiment, then the wave function in all 
of the resultant beams will not coincide with the initial function. 
But there will be a clearcut relationship (25.17) between them, 
which is quite independent of the will of the experimenter making 
the measurements. 

Quantum mechanics does not make the result of a measurement
process subjectivistic; it simply restricts the possibility of simultaneously
carrying out certain measurements. Either one or another
is possible, but not both at once. This statement is known as the
complementarity principle. In effect it is equivalent to the uncertainty
principle.

Mean Values in Quantum Mechanics. We have seen that in quantum
mechanics measurement of a quantity need not necessarily yield
a single strictly defined value. The probability of a certain value
being obtained in a measuring process is unity, that is certainty,
only in the “pure” state. In the most general case, the value of
a measured quantity $\lambda$ is obtained with the probability $w_\lambda$. Let us
find the mean value of the measured quantity according to the
definition

$$\langle\lambda\rangle = \sum_\lambda\lambda w_\lambda \tag{25.18}$$

Substituting $w_\lambda$ from (25.13) and $c_\lambda$ from (25.11a), we have

$$\langle\lambda\rangle = \sum_\lambda\lambda|c_\lambda|^2 = \sum_\lambda\lambda c_\lambda^* c_\lambda = \sum_\lambda\lambda c_\lambda^*\int\psi(x)\psi^*(\lambda, x)\,dx$$

Making use of the condition that the eigenvalues are real, we replace
the product $\lambda\psi^*(\lambda, x)$ under the integral by $\hat\lambda^*\psi^*(\lambda, x)$, and first
sum and then integrate. Then we obtain

$$\langle\lambda\rangle = \int\psi(x)\sum_\lambda c_\lambda^*\hat\lambda^*\psi^*(\lambda, x)\,dx$$

But the operator $\hat\lambda^*$ does not depend upon any definite value of $\lambda$
(for example, if $\lambda = p_x$, then $\hat\lambda^* = -(h/i)(\partial/\partial x)$). Therefore we
take $\hat\lambda^*$ outside the sign of the summation to get

$$\langle\lambda\rangle = \int\psi(x)\hat\lambda^*\sum_\lambda c_\lambda^*\psi^*(\lambda, x)\,dx$$

But the sum $\sum_\lambda c_\lambda^*\psi^*(\lambda, x) = \psi^*(x)$, since this is an equation
which is a complex conjugate of (25.8). Therefore

$$\langle\lambda\rangle = \int\psi(x)\hat\lambda^*\psi^*(x)\,dx$$

Finally, using the Hermiticity of $\hat\lambda$, that is, Eq. (25.3), we obtain
the required expression for the mean value of $\lambda$:

$$\langle\lambda\rangle = \int\psi^*(x)\hat\lambda\psi(x)\,dx \tag{25.19}$$

Thus, in order to calculate the mean value of $\lambda$ in a state $\psi(x)$,
it is not necessary to know the eigenvalues of $\hat\lambda$, since it is sufficient
to calculate the integral (25.19).
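
The equivalence of the two ways of computing a mean value can be tried out for the angular momentum projection, whose operator is $(h/i)\,\partial/\partial\varphi$ with eigenfunctions $e^{ik\varphi}/\sqrt{2\pi}$. The particular superposition `COEFFS`, the choice $h = 1$, and the grid are assumptions made only for this sketch.

```python
import cmath
import math

H = 1.0                      # the constant h, set to 1 for convenience
N = 256                      # grid points on 0 <= phi < 2*pi (hypothetical)
DPHI = 2 * math.pi / N
GRID = [n * DPHI for n in range(N)]
ROOT = math.sqrt(2 * math.pi)

# Hypothetical normalized superposition: the squared moduli add up to 1.
COEFFS = {-1: 0.5, 0: 0.5j, 2: math.sqrt(0.5)}

def psi(phi):
    """psi(phi) = sum over k of c_k exp(i k phi) / sqrt(2 pi)."""
    return sum(c * cmath.exp(1j * k * phi) for k, c in COEFFS.items()) / ROOT

def mz_psi(phi):
    """(h/i) d psi / d phi, differentiated term by term."""
    return sum(H * k * c * cmath.exp(1j * k * phi) for k, c in COEFFS.items()) / ROOT

# Mean value as the sum of eigenvalue times probability.
mean_from_probabilities = sum(H * k * abs(c) ** 2 for k, c in COEFFS.items())

# Mean value as the integral of psi* (M_z psi).
mean_from_integral = sum(psi(p).conjugate() * mz_psi(p) for p in GRID) * DPHI
```

Both numbers come out equal (here to $0.75\,h$), although the second computation never uses the individual probabilities.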

If the state $\psi(x)$ is not a “pure” eigenstate of $\hat\lambda$, each measurement
puts the particle in another state. However, given a sufficient number
of particles in the same initial state, if we consecutively carry out
the measurements on each one, we can obtain the value of $\langle\lambda\rangle$ in the
initial state up to any accuracy. This value can always be reproduced,
provided, of course, that the measurements are not carried out on
particles already once measured, but on a “fresh” batch of particles
in the same state as that which was obtained prior to the first series
of measurements.


Proof of the Uncertainty Relations for $\langle\Delta x\rangle\langle\Delta p_x\rangle$. We define

$$\langle\Delta x\rangle^2 = \langle(x - x_0)^2\rangle \tag{25.20a}$$

and

$$\langle\Delta p_x\rangle^2 = \langle(p_x - p_{x0})^2\rangle \tag{25.20b}$$

To simplify the notation we put $x_0 = 0$ and $p_{x0} = 0$. Let us
consider the following integral of an essentially positive quantity:

$$I = \int|(ax + ib\hat p_x)\psi|^2\,dx \geq 0$$

where $a$ and $b$ are real numbers. Expanding the integrand, we obtain

$$I = a^2\int\psi^* x^2\psi\,dx + b^2\int(\hat p_x^*\psi^*)(\hat p_x\psi)\,dx - iab\int(x\psi)(\hat p_x^*\psi^*)\,dx + iab\int(\hat p_x\psi)(x\psi^*)\,dx$$

We take advantage of the Hermiticity of $\hat p_x$ and transform the
terms involving $ab$ in such a way that there appears the commutator
$\hat p_x x - x\hat p_x$, which from (24.17) is equal to $h/i$. For a normalized
$\psi$ this gives $I = a^2\langle x^2\rangle + b^2\langle p_x^2\rangle - abh$. The integral
must remain positive at all values of $a$ and $b$, which is possible only
if the factors of $a^2$, $b^2$, and $ab$ satisfy the inequality

$$\int\psi^* x^2\psi\,dx\int\psi^*\hat p_x^2\psi\,dx \geq \frac{h^2}{4}\left|\int\psi^*\psi\,dx\right|^2 = \frac{h^2}{4}$$

After extracting the square root, we obtain

$$\langle\Delta x\rangle\langle\Delta p_x\rangle \geq \frac{h}{2} \tag{25.21}$$

which is the lowest estimate of the inaccuracies $\langle\Delta x\rangle$ and $\langle\Delta p_x\rangle$.

It is apparent from the reasoning that (25.21) is valid for any 
pair of Hermitian operators, provided their commutator is a number. 
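
A numerical check of the inequality, offered only as a sketch: we take $h = 1$ and two real trial wave functions of our own choosing, a Gaussian and an odd function $x e^{-x^2/2}$, for which the mean position and mean momentum vanish. The mean square momentum is computed as $h^2\int(d\psi/dx)^2dx$, which equals the $b^2$ term of the expansion above for real $\psi$.

```python
import math

H = 1.0                                   # the constant h, set to 1
DX = 0.01
XS = [-10 + n * DX for n in range(2001)]  # grid on -10 <= x <= 10

def integrate(values):
    """Rectangle-rule integral on the grid XS."""
    return sum(values) * DX

def spreads(psi, dpsi):
    """Return (<dx>, <dp_x>) for a real wave function psi with derivative dpsi,
    assuming the mean position and the mean momentum are zero."""
    norm = integrate([psi(x) ** 2 for x in XS])
    x2 = integrate([x * x * psi(x) ** 2 for x in XS]) / norm
    # For real psi vanishing at infinity, <p^2> = h^2 * integral of (dpsi/dx)^2.
    p2 = H ** 2 * integrate([dpsi(x) ** 2 for x in XS]) / norm
    return math.sqrt(x2), math.sqrt(p2)

gauss = spreads(lambda x: math.exp(-x * x / 2),
                lambda x: -x * math.exp(-x * x / 2))
odd = spreads(lambda x: x * math.exp(-x * x / 2),
              lambda x: (1 - x * x) * math.exp(-x * x / 2))

product_gauss = gauss[0] * gauss[1]   # the Gaussian reaches the bound h/2
product_odd = odd[0] * odd[1]         # any other shape lies above it
```

The Gaussian gives the product $h/2$, the lowest value allowed, while the odd function gives $3h/2$.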


EXERCISES 


1. The mean values of a certain quantity belonging to a state which
is at the same time the eigenstate of the operator corresponding to it,
coincide with the eigenvalues of that quantity. Show that the eigenvalues of
the square of angular momentum are equal to $h^2 l(l + 1)$.

Solution. We construct the square of angular momentum:

$$\hat M^2 = \hat M_x^2 + \hat M_y^2 + \hat M_z^2$$

and find the mean value of both sides of the equation:

$$\langle M^2\rangle = \langle M_x^2\rangle + \langle M_y^2\rangle + \langle M_z^2\rangle$$

Next we make use of the fact that the mean value of the square of any
angular momentum projection is the same:

$$\langle M_x^2\rangle = \langle M_y^2\rangle = \langle M_z^2\rangle$$

The mean value of the square of the angular momentum projection
can be determined from the fact that all its projections are equiprobable:

$$\langle M_z^2\rangle = \frac{1}{2l + 1}\sum_{k = -l}^{l} h^2 k^2 = \frac{h^2}{2l + 1}\cdot\frac{l(l + 1)(2l + 1)}{3} = \frac{h^2 l(l + 1)}{3}$$

But since all the considered states with various projections $M_z$ are
eigenstates with respect to $\hat M^2$, we obtain

$$M^2 = \langle M^2\rangle = h^2 l(l + 1)$$
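
The summation used in this solution can be checked with exact arithmetic. The helper names below are hypothetical, and everything is expressed in units of $h^2$.

```python
from fractions import Fraction

def mean_square_projection(l):
    """Average of k^2 over the 2l + 1 equiprobable values k = -l, ..., l."""
    return Fraction(sum(k * k for k in range(-l, l + 1)), 2 * l + 1)

def square_of_momentum(l):
    """Three equal mean squares of the projections, in units of h^2."""
    return 3 * mean_square_projection(l)

# Values for l = 0, 1, 2, ...; each should equal l * (l + 1).
values = [square_of_momentum(l) for l in range(8)]
```

For every integral $l$ tried, three times the mean square of the projection is exactly $l(l + 1)$.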

2. Determine the energy eigenvalues of a quantum symmetric top.
(An example of such a top is an ammonia molecule, which has the shape of
a pyramid whose base is an equilateral triangle.)

Solution. The energy of a symmetric top expressed in terms of the angular
momentum projections is (see (9.17)):

$$\mathscr E = \frac{1}{2I_1}(M_x^2 + M_y^2) + \frac{1}{2I_3}M_z^2$$

Passing to $M^2$, we have

$$\mathscr E = \frac{M^2}{2I_1} + \frac{M_z^2}{2}\left(\frac{1}{I_3} - \frac{1}{I_1}\right)$$

Substituting the eigenvalues of the angular momentum and its projections,
we finally obtain

$$\mathscr E = \frac{h^2 l(l + 1)}{2I_1} + \frac{h^2 k^2}{2}\left(\frac{1}{I_3} - \frac{1}{I_1}\right)$$
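
To see how the levels of such a top are arranged, one may tabulate the expression $h^2 l(l+1)/(2I_1) + (h^2k^2/2)(1/I_3 - 1/I_1)$ for a few values of $l$ and $k$. In this sketch the argument names `i1` and `i3` (the moment of inertia about the two equal axes and about the symmetry axis) and the numerical values are assumptions chosen for illustration.

```python
def top_energy(l, k, i1, i3, h=1.0):
    """Energy h^2 l(l+1)/(2 i1) + (h^2 k^2 / 2)(1/i3 - 1/i1) of a symmetric top."""
    if abs(k) > l:
        raise ValueError("the projection k must satisfy -l <= k <= l")
    return h ** 2 * l * (l + 1) / (2 * i1) + h ** 2 * k ** 2 / 2 * (1 / i3 - 1 / i1)

# Hypothetical moments of inertia i1 = 1, i3 = 2, in arbitrary units.
levels = {(l, k): top_energy(l, k, i1=1.0, i3=2.0)
          for l in range(3) for k in range(-l, l + 1)}
```

The energy depends on $k$ only through $k^2$, so the levels with $k$ and $-k$ coincide, and for equal moments of inertia the dependence on $k$ disappears altogether.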


26 


TRANSFORMATION 
OF INDEPENDENT VARIABLES 

In classical mechanics we saw that its laws can be formulated in 
such a way as to involve the coordinates and linear momenta in 
equations symmetrically (Sec. 10). In quantum mechanics both 
quantities do not exist simultaneously. It is, however, legitimate 
to ask how to effect the transformation from coordinates to momenta, 
assuming the latter to be independent variables, or, in other words, 
how to introduce the momenta into the equations in place of the 
coordinates. We may take some other system of variables in place 
of the momenta. The only restriction is that all these variables exist 
in one and the same state. 

Matrix Representation of Operators. Suppose we have an equation
for the eigenfunctions of an operator $\hat\lambda$ written in terms of variables $x$,
in which we must pass to the independent variables $\nu$. For the initial
equations we take

$$\hat\lambda\psi(\lambda, x) = \lambda\psi(\lambda, x) \tag{26.1}$$

$$\hat\nu\psi(\nu, x) = \nu\psi(\nu, x) \tag{26.2}$$

In form they resemble (24.12) and (24.13), but now the operators $\hat\lambda$
and $\hat\nu$ do not commute.

We expand an eigenfunction of $\hat\lambda$ in a series of eigenfunctions of $\hat\nu$:

$$\psi(\lambda, x) = \sum_\nu c_\nu\psi(\nu, x) \tag{26.3}$$

where the expansion coefficients $c_\nu$ are given by the general formula
(25.11a):

$$c_\nu = \int\psi^*(\nu, x)\psi(\lambda, x)\,dx \tag{26.4}$$

Substituting this expansion into Eq. (26.1), we premultiply both
sides of the equation by $\psi^*(\nu', x)$ and integrate over $x$. In the
right-hand side, by virtue of (25.10a), we have

$$\lambda\int\psi^*(\nu', x)\sum_\nu c_\nu\psi(\nu, x)\,dx = \lambda\sum_\nu c_\nu\int\psi^*(\nu', x)\psi(\nu, x)\,dx = \lambda c_{\nu'}$$

and in the left-hand side we introduce the notation

$$\lambda_{\nu'\nu} = \int\psi^*(\nu', x)\hat\lambda\psi(\nu, x)\,dx \tag{26.5}$$


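In addition, we change the summation index from $\nu$ to $\nu'$. Then
Eq. (26.1) reduces to the form

$$\sum_{\nu'}\lambda_{\nu\nu'}c_{\nu'} = \lambda c_\nu \tag{26.6}$$

In this form it resembles Eq. (9.9) for finding the principal values
of the inertia tensor. Instead of the quantity with two indices, $I_{\alpha\beta}$,
we have here another quantity, or rather a set of values, $\lambda_{\nu\nu'}$. This
set is more conveniently written as an array

$$\lambda_{\nu\nu'} = \begin{pmatrix}\lambda_{11} & \lambda_{12} & \cdots & \lambda_{1\nu'} & \cdots\\ \lambda_{21} & \lambda_{22} & \cdots & \lambda_{2\nu'} & \cdots\\ \cdots & \cdots & \cdots & \cdots & \cdots\\ \lambda_{\nu 1} & \lambda_{\nu 2} & \cdots & \lambda_{\nu\nu'} & \cdots\\ \cdots & \cdots & \cdots & \cdots & \cdots\end{pmatrix} \tag{26.7}$$

which is called a matrix. The individual elements of $\lambda_{\nu\nu'}$ are called
matrix elements.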
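
A small numerical example may help to fix the idea of a matrix element. Here, purely as an assumption for illustration, the operator is multiplication by $\cos\varphi$ and the variables $\nu$ are the projection numbers $k$ of the functions $e^{ik\varphi}/\sqrt{2\pi}$; the set `KS` is truncated to five values and all names are hypothetical.

```python
import cmath
import math

N = 256                      # grid points on 0 <= phi < 2*pi (hypothetical)
DPHI = 2 * math.pi / N
GRID = [n * DPHI for n in range(N)]
KS = [-2, -1, 0, 1, 2]       # a truncated set of values of nu (here k)

def basis(k, phi):
    """Basis function psi(k, phi) = exp(i k phi) / sqrt(2 pi)."""
    return cmath.exp(1j * k * phi) / math.sqrt(2 * math.pi)

def cos_operator(k, phi):
    """Result of applying multiplication by cos(phi) to psi(k, phi)."""
    return math.cos(phi) * basis(k, phi)

def matrix_element(k_prime, k, operator):
    """Integral of psi*(k', phi) times the operator applied to psi(k, phi)."""
    return sum(basis(k_prime, p).conjugate() * operator(k, p) for p in GRID) * DPHI

# Rows are labelled by k', columns by k.
matrix = [[matrix_element(kp, k, cos_operator) for k in KS] for kp in KS]
```

The array has the value 1/2 next to the principal diagonal and zeros elsewhere, so that in these variables the operator is not diagonal.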

Thus, in most general form the operator $\hat\lambda$ is represented in terms
of the arbitrary variables $\nu$ as a matrix (26.7). It will be shown further
on that the initial form of the operation of the operator $\hat\lambda$,
(26.1), can also be reduced to the form (26.6).

If (26.6) serves for finding the same eigenvalues $\lambda$ as (26.1), it is
natural to understand the coefficients $c_\nu$ appearing in (26.6) as the
eigenfunctions of $\hat\lambda$ in terms of the variables $\nu$, which can be written
down as follows:

$$c_\nu = \psi(\lambda, \nu) \tag{26.8}$$

We have thus effected, as it were, a transition to a new system of
coordinate axes in Hilbert space. Equation (26.1) defined a system
of eigenvectors of $\hat\lambda(x)$, that is $\psi(\lambda, x)$, whereas (26.6) yields
a system of eigenvectors $c_\nu = \psi(\lambda, \nu)$ of $\hat\lambda(\nu) = \lambda_{\nu\nu'}$. Here $\lambda_{\nu\nu'}$
must be seen as the whole array of matrix elements, the matrix (26.7).

The condition for the Hermiticity of $\lambda_{\nu\nu'}$ can be rewritten in
matrix notation very simply. Expressing this condition in the
form (25.3), and substituting $\chi = \psi(\nu, x)$ and $\psi = \psi(\nu', x)$ into
it, we obtain

$$\int\psi^*(\nu, x)\hat\lambda\psi(\nu', x)\,dx = \int\psi(\nu', x)\hat\lambda^*\psi^*(\nu, x)\,dx = \Bigl(\int\psi^*(\nu', x)\hat\lambda\psi(\nu, x)\,dx\Bigr)^*$$

or, using the general notation (26.5) for a matrix element,

$$\lambda_{\nu\nu'} = \lambda^*_{\nu'\nu} \tag{26.9}$$

In this form the condition for the Hermiticity of an operator
closely resembles the symmetry condition for a tensor. But since
in Hilbert space vectors are complex, a complex conjugation must
be performed together with the interchange of indices.
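
The difference between symmetry and Hermiticity is easily seen on small arrays. The function `is_hermitian` and the two sample matrices below are hypothetical illustrations, not objects from the text.

```python
def is_hermitian(matrix, tol=1e-12):
    """True if matrix[a][b] equals the complex conjugate of matrix[b][a]."""
    size = len(matrix)
    return all(
        abs(matrix[a][b] - complex(matrix[b][a]).conjugate()) <= tol
        for a in range(size) for b in range(size)
    )

hermitian_example = [[1.0, 2 - 1j], [2 + 1j, -3.0]]
symmetric_but_complex = [[0.0, 1j], [1j, 0.0]]   # symmetric, yet not Hermitian
```

A real symmetric array always passes the test, whereas a symmetric array with complex elements, like the second one, in general does not.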

Let us now show how to write in matrix form the result of a successive
application of two operators $\hat\lambda$ and $\hat\eta$ to some vector $c_\nu$. The
operation of the first operator is written in the form

$$a_{\nu'} = \sum_\nu\lambda_{\nu'\nu}c_\nu \tag{26.10a}$$

where $a_{\nu'}$ is a new vector (cf. in Section 9: $M_\alpha = I_{\alpha\beta}\omega_\beta$, where $\omega_\beta$
is the angular velocity vector and $M_\alpha$ is the angular momentum
vector). Then the application of the next operator $\hat\eta$ to $a_{\nu'}$ should
be represented in matrix form as

$$b_\mu = \sum_{\nu'}\eta_{\mu\nu'}a_{\nu'} = \sum_{\nu'}\sum_\nu\eta_{\mu\nu'}\lambda_{\nu'\nu}c_\nu = \sum_\nu\Bigl(\sum_{\nu'}\eta_{\mu\nu'}\lambda_{\nu'\nu}\Bigr)c_\nu \tag{26.10b}$$

We see from this that the result of the successive application of
two operators to the vector $c_\nu$ is given by the matrix

$$(\eta\lambda)_{\mu\nu} = \sum_{\nu'}\eta_{\mu\nu'}\lambda_{\nu'\nu}, \qquad b_\mu = \sum_\nu(\eta\lambda)_{\mu\nu}c_\nu \tag{26.11}$$

The obtained formula expresses the rule of matrix multiplication.
It corresponds to the successive operation of two operators, $\hat\eta$ and $\hat\lambda$.
From formula (26.11) it can be seen that, like the application of
operators, multiplication of matrices is noncommutative:

$$(\lambda\eta)_{\mu\nu} = \sum_{\nu'}\lambda_{\mu\nu'}\eta_{\nu'\nu} \neq \sum_{\nu'}\eta_{\mu\nu'}\lambda_{\nu'\nu} = (\eta\lambda)_{\mu\nu} \tag{26.12}$$

The multiplication of a vector by a number can always be represented
in the form

$$ac_\nu = \sum_{\nu'}a\,\delta_{\nu\nu'}c_{\nu'} \tag{26.13}$$

where $\delta_{\nu\nu'}$ is a unit matrix in Hilbert space.

With the help of the matrix $\delta_{\nu\nu'}$, the commutation relation for
any two operators which yield a number as a result of the commutation,
as for example $\hat p_x$ and $x$, has the form

$$\sum_{\nu'}\bigl((p_x)_{\mu\nu'}(x)_{\nu'\nu} - (x)_{\mu\nu'}(p_x)_{\nu'\nu}\bigr) = \frac{h}{i}\,\delta_{\mu\nu} \tag{26.14}$$

If the commutation yields a new operator, as in the case of the
angular momentum projections, in matrix form it is written as
follows:

$$\sum_{\nu'}\bigl((M_x)_{\mu\nu'}(M_y)_{\nu'\nu} - (M_y)_{\mu\nu'}(M_x)_{\nu'\nu}\bigr) = ih\,(M_z)_{\mu\nu} \tag{26.15}$$
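
The matrix commutation relation for the angular momentum projections can be verified on the smallest nontrivial case. The three $3\times 3$ arrays below are the standard $l = 1$ matrices in the variables $k = +1, 0, -1$; they are quoted here as an assumption (the text does not derive them), with $h$ set to 1.

```python
import math

H = 1.0
R = H / math.sqrt(2)

# Standard l = 1 matrices in the basis k = +1, 0, -1 (quoted, not derived here).
MX = [[0, R, 0], [R, 0, R], [0, R, 0]]
MY = [[0, -1j * R, 0], [1j * R, 0, -1j * R], [0, 1j * R, 0]]
MZ = [[H, 0, 0], [0, 0, 0], [0, 0, -H]]

def matmul(a, b):
    """(ab)_{mu nu} = sum over nu' of a_{mu nu'} b_{nu' nu}."""
    size = len(a)
    return [[sum(a[m][j] * b[j][n] for j in range(size)) for n in range(size)]
            for m in range(size)]

def commutator(a, b):
    """Matrix of ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[m][n] - ba[m][n] for n in range(len(a))] for m in range(len(a))]

lhs = commutator(MX, MY)                                   # M_x M_y - M_y M_x
rhs = [[1j * H * MZ[m][n] for n in range(3)] for m in range(3)]   # i h M_z
```

The two products taken in different order differ, and their difference reproduces $ih$ times the matrix of the third projection.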

The Diagonal Form of a Matrix. Formula (26.4) can be used to
represent the transition to the eigenfunction in the variable $\nu$,
that is $\psi(\lambda, \nu) = c_\nu$, in the following symmetrical way:

$$\psi(\lambda, \nu) = \int\psi^*(\nu, x)\psi(\lambda, x)\,dx \tag{26.16}$$

This is the general transformation formula for an independent
variable. But it can also be given a different meaning if we put
$\nu = \lambda'$. In other words, for $\nu = \lambda'$, Eq. (26.16) undergoes a transition
to a set of variables that are themselves eigenvalues of the given
operator:

$$\psi(\lambda', \lambda) = \int\psi^*(\lambda', x)\psi(\lambda, x)\,dx \tag{26.17}$$

But by virtue of the orthogonality of the eigenfunctions, (25.10a),
we obtain

$$\psi(\lambda', \lambda) = \delta_{\lambda'\lambda} \tag{26.18}$$

The eigenfunction of an operator in variables that are the set of its
own eigenvalues is simply a $\delta$ matrix. In these variables the operator
itself appears in a very simple form. This can be seen directly from
Eq. (26.6). Substituting the obtained eigenfunction $c_\nu = \delta_{\lambda'\lambda}$ ($\nu = \lambda'$)
into it, we find that a matrix representing an operator in terms of its
own variables retains only the terms that occur along the principal
diagonal in the array (26.7):

$$(\lambda)_{\lambda'\lambda} = \lambda\,\delta_{\lambda'\lambda} \tag{26.19}$$

In other words, it is said that the operator has been reduced to
diagonal form. For two operators to be reducible to diagonal form
simultaneously, that is, in the same state, they must commute.
In diagonal form two matrices always commute, since if
$\lambda_{\mu\nu} = \lambda\delta_{\mu\nu}$ and $\eta_{\nu\varepsilon} = \eta\delta_{\nu\varepsilon}$, then

$$(\lambda\eta)_{\mu\varepsilon} = \lambda\eta\sum_\nu\delta_{\mu\nu}\delta_{\nu\varepsilon} = \lambda\eta\,\delta_{\mu\varepsilon}$$

is equal to

$$(\eta\lambda)_{\mu\varepsilon} = \eta\lambda\sum_\nu\delta_{\nu\varepsilon}\delta_{\mu\nu} = \eta\lambda\,\delta_{\mu\varepsilon}$$

But the commutativity of two matrices cannot depend on the set
of variables in which they are expressed. As pointed out, a transition
to another set of variables represents a rotation of the coordinate
axes in Hilbert space. But if some matrix, in the present case one
equal to the commutator of two other matrices, is not equal to zero
in one system of axes, no rotation can make it vanish in another
system. For two matrices to commute in one coordinate system
they must commute in all systems. In particular, the respective
operators should commute not only in matrix form but in coordinate
representation as well.
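
That a rotation of axes cannot create or destroy commutativity may be illustrated on two-row matrices. In this sketch the rotation `U`, the angle `THETA`, and the matrices `A`, `B`, `C` are arbitrary examples of ours; a matrix is carried to the rotated axes by forming $U a U^\dagger$.

```python
import math

def matmul(a, b):
    size = len(a)
    return [[sum(a[m][j] * b[j][n] for j in range(size)) for n in range(size)]
            for m in range(size)]

def dagger(u):
    """Hermitian conjugate: transposition plus complex conjugation."""
    size = len(u)
    return [[complex(u[n][m]).conjugate() for n in range(size)] for m in range(size)]

def rotate(a, u):
    """Matrix a expressed in the rotated axes: u a u^dagger."""
    return matmul(matmul(u, a), dagger(u))

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[m][n] - ba[m][n] for n in range(len(a))] for m in range(len(a))]

THETA = 0.7                                    # arbitrary rotation angle
U = [[math.cos(THETA), -math.sin(THETA)], [math.sin(THETA), math.cos(THETA)]]

A = [[1.0, 0.0], [0.0, 2.0]]                   # two diagonal, hence commuting, matrices
B = [[3.0, 0.0], [0.0, -1.0]]
C = [[0.0, 1.0], [1.0, 0.0]]                   # does not commute with A

comm_ab_rotated = commutator(rotate(A, U), rotate(B, U))   # stays zero
comm_ac_rotated = commutator(rotate(A, U), rotate(C, U))   # stays nonzero
```

After the rotation neither `A` nor `B` is diagonal any longer, yet their commutator still vanishes, while the commutator of `A` and `C` remains different from zero.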

Then each of the matrix indices $\mu$, $\nu$, $\varepsilon$ in the expressions just
written corresponds to a certain set of simultaneous eigenvalues
of the operators $\hat\lambda$ and $\hat\eta$. This is the difference between these
expressions and formula (26.19), which refers only to the eigenvalues of
one operator $\hat\lambda$.

The variables may assume either a discrete or a continuous set of 
values. For example, rotation angles vary continuously, whereas 
the eigenvalues of the angular momentum square and its projection 
vary discretely. It is therefore desirable to give such a meaning 
to the transformation formulas that would make them equally 
applicable to a discrete and a continuous set of variables. 

Suppose the variable $\lambda$ varies continuously. It was pointed out in
Section 24 that the eigenfunction of an operator in terms of its
variable (in this case $x$) is equal to zero everywhere except at some
point $x = x'$ corresponding to the eigenvalue. Hence, if the eigenvalues
$\lambda$ form a continuous set, then

$$\psi(\lambda', \lambda) = \int\psi^*(\lambda', x)\psi(\lambda, x)\,dx = 0 \quad\text{at}\quad \lambda' \neq \lambda \tag{26.20}$$

which agrees with the orthogonality condition (25.6). But whereas in
the case of a discrete set of eigenvalues $\lambda$ the integral (26.20) is equal
to unity at $\lambda = \lambda'$, for a continuous set the situation changes.
To investigate this case, we multiply both sides of (26.20) by some
function $c(\lambda)$ and integrate over all allowed values of $\lambda$ and, furthermore,
interchange the integrations over $\lambda$ and over $x$ on the right
to get

$$\int\psi(\lambda', \lambda)c(\lambda)\,d\lambda = \int\psi^*(\lambda', x)\,dx\int c(\lambda)\psi(\lambda, x)\,d\lambda \tag{26.21}$$

Then the integral over $\lambda$ on the right, that is $\int c(\lambda)\psi(\lambda, x)\,d\lambda$,
can be treated as a generalization of the series (25.8) for a continuous
spectrum of $\lambda$:

$$\psi(x) = \int c(\lambda)\psi(\lambda, x)\,d\lambda \tag{26.22}$$

We now state the basic requirement with respect to functions of
a continuous spectrum: the expansion coefficients in the series (25.8)
and the integrand $c(\lambda)$ in the expansion (26.22) must be expressed
completely analogously, that is (see (25.11a))

$$c(\lambda) = \int\psi^*(\lambda, x)\psi(x)\,dx \tag{26.23}$$

But it is apparent from this equation that the expression appearing
in the right-hand side of (26.21) is nothing but $c(\lambda')$:

$$\int\psi^*(\lambda', x)\,dx\int c(\lambda)\psi(\lambda, x)\,d\lambda = \int\psi^*(\lambda', x)\psi(x)\,dx = c(\lambda') \tag{26.24}$$

Thus, the eigenfunction ip (X', k) possesses the remarkable prop¬ 
erty that in integration it simply replaces the argument of the 
function it multiplies in the integrand: 

j X)'c(%) dl — c(\') (26.25) 

If we wish to preserve the analogy with the discrete spectrum of k r 
it is convenient to introduce in place of the 8 XX ' matrix a similar 
notation, the 8 function. Denoting yp (A/, A,) = 8 (A/ — A,), we write 

j 6 (V — X) c (X) dX = c (X r ) (26.26) 

Taking c (k) = 1, we find from the preceding equation that 

JS(r-X)dX=l (26.27) 

We have thus obtained the two basic properties of the $\delta$ function introduced by Dirac: it is equal to zero everywhere except at the point where its argument is zero, and at that point it becomes infinite in such a way that its integral is equal to unity.

The normalization condition in a continuous spectrum is written with the help of the $\delta$ function in the following way, generalizing (26.20):

$$\int \psi^*(\lambda', x)\,\psi(\lambda, x)\,dx = \delta(\lambda' - \lambda) \tag{26.28}$$

We thus see that, although the integral of $|\psi(\lambda, x)|^2$ becomes infinite, the normalization of functions in a continuous spectrum involves no difficulties. The normalization should be to the $\delta$ function, not to unity as in the discrete spectrum.

Differentiating both sides of Eq. (26.26) with respect to $\lambda'$, we find that

$$\int \left(\frac{\partial}{\partial\lambda'}\,\delta(\lambda' - \lambda)\right) c(\lambda)\,d\lambda = \frac{\partial c(\lambda')}{\partial\lambda'} \tag{26.29}$$

If necessary, this formula may be differentiated as many times as there are derivatives of $c(\lambda')$. Thus, the $\delta$ function under the integral sign can be differentiated as many times as necessary.

Actually, we have already encountered the $\delta$ function in discussing the distribution of a charge (that is, a function of its density) for the case of a point charge. Then, too, the density integral over the volume was equal to a finite quantity, as in formula (26.27).
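
The properties (26.26), (26.27) and (26.29) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the $\delta$ function is replaced by a narrow normalized Gaussian of width `eps`, the test function is $c(\lambda) = \cos\lambda$, and the names `delta_approx`, `delta_prime_approx` and `integrate` are hypothetical helpers introduced here, not anything defined in the text.

```python
import math

def delta_approx(u, eps):
    """Narrow normalized Gaussian of width eps, standing in for delta(u)."""
    return math.exp(-u * u / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

def delta_prime_approx(u, eps):
    """Derivative of delta_approx with respect to its argument u."""
    return -u / (eps * eps) * delta_approx(u, eps)

def integrate(f, a, b, n=20000):
    """Midpoint rule on [a, b] with n equal cells."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

c = math.cos      # test function c(lambda), an arbitrary smooth choice
lam0 = 0.7        # the point lambda'
eps = 0.01        # width of the approximate delta function

# (26.27): the integral of the delta function is unity.
norm = integrate(lambda lam: delta_approx(lam0 - lam, eps), -5.0, 5.0)
# (26.26): the delta function replaces the argument of c.
sift = integrate(lambda lam: delta_approx(lam0 - lam, eps) * c(lam), -5.0, 5.0)
# (26.29): differentiating the delta function with respect to lambda'
# differentiates c at lambda'.
deriv = integrate(lambda lam: delta_prime_approx(lam0 - lam, eps) * c(lam), -5.0, 5.0)

print(norm, sift, c(lam0), deriv, -math.sin(lam0))
```

The agreement improves as `eps` is reduced, provided the integration grid stays fine compared with `eps`.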

The Transformation to Momentum Representation. As an example let us examine the form quantum mechanical equations take if the momentum projections rather than the coordinates are taken as the independent variables. Note, firstly, that in coordinate representation, that is, with $x$, $y$, $z$ as independent variables, quantum mechanical equations can also be given the general matrix form (26.6). For that the operator $\hat\lambda(x)$ should be represented in the form $\hat\lambda(x')\,\delta(x - x')$ and integration should be introduced instead of summation. The derivatives involved in the operator $\hat\lambda(x)$ are replaced by the corresponding derivatives of the $\delta$ function, as in Eq. (26.29): we must go over from $\partial\psi/\partial x$ to $(\partial/\partial x')\,\delta(x' - x)$.

Although it would seem that the coordinate form of writing quan¬ 
tum mechanical equations is simpler than all others, in many cases 
other representations may offer substantial advantages, which will 
be made clear from subsequent applications. Werner Heisenberg, who obtained the basic equations of quantum mechanics independently of Schrödinger, from the outset employed explicit matrix representation rather than coordinate representation.

Let us now find the position multiplication operator in momentum representation. For that we interchange $\lambda$ and $\nu$ in (26.16) to get

$$\psi(\nu, \lambda) = \int \psi^*(\lambda, x)\,\psi(\nu, x)\,dx \tag{26.30}$$

We have obtained the eigenfunction of $\nu$ in terms of $\lambda$ variables. But from a comparison with (26.16) it is apparent that $\psi(\nu, \lambda)$ is the complex conjugate of the function $\psi(\lambda, \nu)$:

$$\psi(\nu, \lambda) = \psi^*(\lambda, \nu) \tag{26.31}$$

It is apparent from this how the eigenfunction of the position operator is expressed in terms of the momentum variable: if $\psi(p_x, x) = e^{ip_x x/\hbar}$, then $\psi(x, p_x) = e^{-ip_x x/\hbar}$. Since the eigenvalue equation for position is

$$\hat x\,\psi(x, p_x) = x\,\psi(x, p_x)$$

in the momentum variables we must put$^2$

$$\hat x = -\frac{\hbar}{i}\,\frac{\partial}{\partial p_x} \tag{26.32}$$

In momentum representation the kinetic-energy operator becomes a numerical factor:

$$\hat T = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right) \tag{26.33}$$

However, the potential energy operator may prove to be more cumbersome than in coordinate representation, that is, it does not have the form of a matrix multiplied by a $\delta$ function.

Let us find the potential energy operator for the concrete case of the Coulomb field. For that we must first normalize the eigenfunction $\psi(p_x, x)$ of a continuous spectrum. Recall the Fourier integral theorem presented in Section 19: if two functions are connected by an integral relationship

$$f(u) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} e^{iux}\,g(x)\,dx$$

then the inverse transformation from $f(u)$ to $g(x)$ is written as

$$g(x) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} e^{-iux}\,f(u)\,du$$

These equations should be compared with (26.22) and (26.23). In (26.22) we must substitute $p_x$ for $\lambda$ and the function $e^{ip_x x/\hbar}/(2\pi\hbar)^{1/2}$ for $\psi(\lambda, x)$. It is then apparent that

$$\psi(x) = \int_{-\infty}^{\infty} c(p_x)\,\frac{e^{ip_x x/\hbar}}{(2\pi\hbar)^{1/2}}\,dp_x, \qquad c(p_x) = \int_{-\infty}^{\infty} \psi(x)\,\frac{e^{-ip_x x/\hbar}}{(2\pi\hbar)^{1/2}}\,dx \tag{26.34}$$

already with the correct normalization, since it was precisely (26.22) and (26.23) that defined the normalization condition.

$^2$ Such a simple transition is possible only in this case.

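Thus, to find the matrix element of the Coulomb potential energy we must compute the following integral:

$$\left(\frac{1}{r}\right)_{\mathbf p'\mathbf p} = \frac{1}{(2\pi\hbar)^3} \int \frac{e^{i(\mathbf p - \mathbf p')\cdot\mathbf r/\hbar}}{r}\,dV \tag{26.35}$$

Here the full set of components of the momentum vectors $\mathbf p'$ and $\mathbf p$ plays the role of the indices $\nu'$ and $\nu$.

The integral is best found in the following manner. We determine the “potential” $\varphi$ of a certain “charge” distribution equal to $\rho = (2\pi\hbar)^{-3}\,e^{i(\mathbf p - \mathbf p')\cdot\mathbf r/\hbar}$. In accordance with Section 16, $\varphi$ satisfies the equation

$$\nabla^2\varphi = -\frac{4\pi}{(2\pi\hbar)^3}\,e^{i(\mathbf p - \mathbf p')\cdot\mathbf r/\hbar}$$

The potential of a charge element $\rho\,dV$ is $d\varphi = (\rho/r)\,dV$. Then the potential of the whole charge at point $\mathbf r = 0$ is expressed precisely by formula (26.35). But as can be directly observed, the solution of the potential equation is represented as

$$\varphi = \frac{1}{2\pi^2\hbar}\,\frac{e^{i(\mathbf p - \mathbf p')\cdot\mathbf r/\hbar}}{(\mathbf p - \mathbf p')^2} \tag{26.36}$$

which can be verified by substituting (26.36) into the Poisson equation. We obtain expression (26.35) if we substitute $\mathbf r = 0$. Thus, the matrix element of $r^{-1}$ looks like this:

$$\left(\frac{1}{r}\right)_{\mathbf p'\mathbf p} = \frac{1}{2\pi^2\hbar\,(\mathbf p - \mathbf p')^2} \tag{26.37}$$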


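and the Schrödinger integral equation for the motion of a particle in a Coulomb field written in terms of momentum variables has the form

$$\frac{p^2}{2m}\,\psi(E, \mathbf p) - \frac{Ze^2}{2\pi^2\hbar} \int \frac{\psi(E, \mathbf p')\,d^3p'}{(\mathbf p - \mathbf p')^2} = E\,\psi(E, \mathbf p) \tag{26.38}$$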
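
The substitution check mentioned for (26.36) can be written out in one line. What follows is a sketch of that single step only; the shorthand $\mathbf q = \mathbf p - \mathbf p'$ is introduced here for brevity and is not notation used elsewhere in the text. Since each differentiation of the exponential with respect to a coordinate brings down a factor $iq_\alpha/\hbar$,

$$\nabla^2 e^{i\mathbf q\cdot\mathbf r/\hbar} = -\frac{q^2}{\hbar^2}\,e^{i\mathbf q\cdot\mathbf r/\hbar}$$

and therefore

$$\nabla^2\varphi = \frac{1}{2\pi^2\hbar\,q^2}\left(-\frac{q^2}{\hbar^2}\right) e^{i\mathbf q\cdot\mathbf r/\hbar} = -\frac{1}{2\pi^2\hbar^3}\,e^{i\mathbf q\cdot\mathbf r/\hbar} = -\frac{4\pi}{(2\pi\hbar)^3}\,e^{i\mathbf q\cdot\mathbf r/\hbar}$$

because $4\pi/(2\pi\hbar)^3 = 1/(2\pi^2\hbar^3)$. The right-hand side is $-4\pi$ times the “charge” density $(2\pi\hbar)^{-3}\,e^{i\mathbf q\cdot\mathbf r/\hbar}$, as the Poisson equation requires.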


In general, if they are expressed in terms of arbitrary variables, all quantum mechanical equations are integral equations. A special feature of the coordinate representation is that it leads to $\delta'$ functions for momentum component operators, and the integral equations therefore transform into differential equations. In any case, the representation of an operator with the help of the $\delta'$ function, the derivative of the $\delta$ function, cannot be considered diagonal: such a function is other than zero in a domain infinitely close to the zero value of the argument, but never precisely where the argument actually is zero.

The expression (26.32) for the coordinate in terms of the momentum variable is valid, because the momentum and the coordinate are defined in the same interval: they vary continuously from $-\infty$ to $\infty$. But an angular coordinate (azimuth, for example) varies only between 0 and $2\pi$. This imposes the condition of single-valuedness on the wave function which, as applied to the present case, takes the form

$$\psi(p_\varphi, \varphi) = \psi(p_\varphi, \varphi + 2\pi) \tag{26.39}$$

Substituting $\psi(p_\varphi, \varphi) = e^{ip_\varphi\varphi/\hbar}$, we find $p_\varphi = \hbar k$, where $k$ is an integer.

We obtained this result before, using the general condition of the single-valuedness of the wave function in its dependence on the spatial coordinates $x$, $y$, $z$, whence it followed that $l$ and $k$ are integers. The example (26.39) was used to show how the restricted nature of the variation interval of the coordinate $\varphi$ yields a discrete spectrum for the corresponding momentum $p_\varphi$. As a consequence, the multiplication operator of angle $\varphi$ in terms of the variables $p_\varphi$ does not have a simple form similar to (26.32) (see Exercise 1).

In other words, this can be expressed as follows: the mathematical differential notation of operator $\hat p_x$ (or $\hat p_\varphi$) does not effectively define it as long as the boundary conditions imposed on the eigenfunction are not stated. Unlike differential representation, the integral representation of an operator is complete and therefore sometimes preferable.

Unitary Transformations. We shall examine some general properties of transformations from one system of independent variables to another. First, let us find the inverse transformation of (26.16), that is, from the independent variable $\nu$ to the variable $x$:

$$\psi(\lambda, x) = \int \psi^*(x, \nu)\,\psi(\lambda, \nu)\,d\nu \tag{26.40}$$

The integral form of notation has been used here, but as was shown before, it is in principle no different from notation with the help of a sum for a discrete spectrum. With the help of the relationship (26.31) we eliminate the complex conjugate functions, after which (26.16) and (26.40) take the following form:

$$\psi(\lambda, \nu) = \int \psi(\lambda, x)\,\psi(x, \nu)\,dx \tag{26.41a}$$

$$\psi(\lambda, x) = \int \psi(\lambda, \nu)\,\psi(\nu, x)\,d\nu \tag{26.41b}$$

The obtained transformations are in form analogous to the matrix multiplication (26.12): it is immaterial whether the indices appear in the subscripts or the arguments of the two variables, or whether we perform a summation or integration with respect to a matrix index.

We can thus say that we have determined the transformation matrix from variables $x$ to variables $\nu$:

$$U_{x\nu} = \psi(x, \nu) \tag{26.42a}$$

as well as the inverse transformation matrix, which is conventionally written as $U_{\nu x}$ with the exponent $-1$:

$$U^{-1}_{\nu x} = \psi(\nu, x) \tag{26.42b}$$

But then we see from Eq. (26.31) that between the matrices of the direct and inverse transformations there exists the relationship

$$U^{-1}_{\nu x} = U^*_{x\nu} \tag{26.43}$$

The transition to complex conjugate elements with a simultaneous interchange of indices is called Hermitian conjugation. A little cross is used instead of an asterisk to distinguish Hermitian conjugates from complex conjugates:

$$U^{+}_{\nu x} = U^*_{x\nu} \tag{26.44}$$

Applying this notation to the $U$ matrix, we rewrite (26.43) as follows:

$$U^{-1}_{\nu x} = U^{+}_{\nu x} \tag{26.45}$$

In other words, a matrix that is the inverse of $U_{x\nu}$ is at the same time Hermitian conjugate to it. Such matrices are called unitary matrices (we recall that the elements of a Hermitian matrix satisfy the relationship $A^{+}_{\alpha\beta} = A_{\alpha\beta}$).

From the definition of an inverse matrix we have

$$\int U^{-1}(\nu', x)\,U(x, \nu)\,dx = \delta_{\nu'\nu} \tag{26.46}$$

so that a unitary matrix satisfies the relationship

$$\int U^{+}(\nu', x)\,U(x, \nu)\,dx = \delta_{\nu'\nu} \tag{26.47}$$

which is equivalent to the orthogonality condition of the eigenfunctions of the operator $\hat\nu$ in terms of the variables $x$.

Transformation of an operator to another representation is also performed with the help of a unitary matrix $U$. Let us show this on the basis of (26.5). Note, first, that the operator $\hat\lambda$ in the integrand is expressed in terms of the variables $x$, and for that reason alone there is only one integration. In general form operators are expressed by matrices which, as pointed out, involve $\delta'$ functions, making it possible to reduce integration to differentiation only in the $x$ representation. Therefore, the more general notation of (26.5), in which $x$ need not necessarily be interpreted as a coordinate, is

$$\lambda_{\nu'\nu} = \int\!\!\int \psi^*(\nu', x')\,\lambda_{x'x}\,\psi(\nu, x)\,dx'\,dx \tag{26.48}$$

With the help of (26.31), (26.42a), and (26.42b), we give the formula for transforming an operator to new variables the form

$$\lambda_{\nu'\nu} = \int\!\!\int U^{-1}_{\nu'x'}\,\lambda_{x'x}\,U_{x\nu}\,dx'\,dx = \int\!\!\int U^{+}_{\nu'x'}\,\lambda_{x'x}\,U_{x\nu}\,dx'\,dx \tag{26.49}$$

that is, involving the product of three matrices, $U^{+}$, $\lambda$, and $U$.

The products of matrices multiplied by vectors or by one another are frequently written without explicitly denoting the indices. Then the transformation formulas from one set of variables to another for wave functions are expressed as

$$\psi' = \psi U \tag{26.50}$$

and for operators as

$$\lambda' = U^{+}\lambda U \tag{26.51}$$

while the unitary condition is

$$U^{+}U = 1$$

where the unity in the right-hand side actually denotes a $\delta$ matrix or $\delta$ function.
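
The index-free formulas (26.51) and $U^{+}U = 1$ can be tried out on a small discrete example. The sketch below is an illustration under assumptions made here: a two-dimensional space, a unitary matrix built from two arbitrary angles, and an arbitrary Hermitian matrix `X` playing the role of the operator; the helper names `matmul`, `dagger` and `trace` are hypothetical.

```python
import cmath
import math

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Hermitian conjugate: interchange of indices plus complex conjugation."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def trace(a):
    """Sum of the diagonal elements."""
    return sum(a[i][i] for i in range(len(a)))

# A 2x2 unitary matrix built from two arbitrary angles (illustrative values).
t, alpha = 0.6, 1.1
U = [[math.cos(t), -math.sin(t) * cmath.exp(-1j * alpha)],
     [math.sin(t) * cmath.exp(1j * alpha), math.cos(t)]]

# An arbitrary Hermitian matrix standing in for the operator.
X = [[1.0, 2 - 1j],
     [2 + 1j, -3.0]]

UdU = matmul(dagger(U), U)                 # unitarity: should be the unit matrix
X_new = matmul(matmul(dagger(U), X), U)    # the operator in the new variables

print(UdU)
print(X_new, trace(X_new))
```

The transformed matrix stays Hermitian and keeps its trace, which is the discrete counterpart of the statements made about $U$ in this section.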

The unitarity of the transformation matrix $U$ is similar to the same condition for the cosines of the rotation angles between old and new Cartesian coordinate axes. If the direct transformations are written as $x'_\alpha = A_{\alpha\beta}x_\beta$, the inverse transformations have the form $x_\beta = A_{\alpha\beta}x'_\alpha$ ($A_{\alpha\beta}$ is the cosine of the angle between the new axis with the label $\alpha$ and the old axis with the label $\beta$). But inverse transformations can be expressed with the help of the inverse matrix $A^{-1}_{\alpha\beta}$, so that $A^{-1}_{\alpha\beta} = A_{\beta\alpha}$. In Hilbert space, where vectors are complex, complex conjugation is performed in addition to interchanging the indices.

Determination of the eigenvalues of an operator represents in effect the reduction of a second-order “surface” in Hilbert space to the principal axes. Two surfaces cannot, in general, be reduced to the principal axes simultaneously because the respective principal axes of these surfaces are differently oriented. Such is the geometric interpretation of the provision that two noncommuting operators do not have eigenvalues in the same state, that is, the corresponding quantities cannot exist simultaneously.


EXERCISES 

1. Find the matrix elements of the multiplication operator of the rotation angle $\varphi$ in terms of the variables $p_\varphi = \hbar k$.

Answer. From Eq. (26.5) we obtain

$$\varphi_{k'k} = \frac{i}{k' - k} \quad \text{for } k' \neq k, \qquad \varphi_{k'k} = \pi \quad \text{for } k' = k$$
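
The answer to Exercise 1 can be checked by direct numerical integration. The sketch below assumes normalized eigenfunctions $e^{ik\varphi}/(2\pi)^{1/2}$ on the interval from 0 to $2\pi$ and places the complex conjugate function on the left, as in (26.48); both conventions are assumptions of the sketch, and the function name `phi_element` is hypothetical.

```python
import cmath
import math

def phi_element(k_prime, k, n=20000):
    """Midpoint-rule value of (1/2pi) * integral_0^{2pi} phi * exp(i(k - k')phi) dphi."""
    h = 2 * math.pi / n
    total = 0j
    for j in range(n):
        phi = (j + 0.5) * h
        total += phi * cmath.exp(1j * (k - k_prime) * phi)
    return total * h / (2 * math.pi)

diagonal = phi_element(3, 3)        # k' = k
off_diagonal = phi_element(5, 2)    # k' != k

print(diagonal, math.pi)
print(off_diagonal, 1j / (5 - 2))
```

The off-diagonal elements fall off only as $1/(k' - k)$, so the matrix is far from diagonal, in line with the remark that the angle operator has no simple form in these variables.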

2. Show that a unitary operator $U$ can be represented in the form

$$U = e^{i\Phi}$$

where $\Phi$ is a certain Hermitian operator.

Solution. If

$$U = e^{i\Phi}$$

then

$$U^{-1} = e^{-i\Phi}$$

Let us now show that $U^{-1} = U^{+}$. For this, note, firstly, that a transformation to a complex conjugate operator in any case means a reversal of the sign of $i$ in the exponent. Further, we expand the exponent in a series:

$$e^{-i\Phi} = 1 - i\Phi + \frac{i^2}{2!}\Phi^2 - \frac{i^3}{3!}\Phi^3 + \dots$$

We now take one of the terms of the series, the third, for example, and write it with the help of matrix indices, agreeing that we make use of the summation convention and know the Hermiticity of $\Phi$ ($\Phi^{+}_{\alpha\beta} = \Phi_{\alpha\beta}$):

$$(\Phi^2)^{+}_{\alpha\beta} = (\Phi_{\beta\gamma}\Phi_{\gamma\alpha})^* = \Phi^*_{\gamma\alpha}\Phi^*_{\beta\gamma} = \Phi^{+}_{\alpha\gamma}\Phi^{+}_{\gamma\beta} = (\Phi^{+})^2_{\alpha\beta}$$

Collecting the expansion terms in exponential form, we obtain

$$U^{-1} = U^{+}$$
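
The statement of Exercise 2 can also be checked numerically by summing the exponential series. This is a sketch under assumptions made here: a 2×2 Hermitian matrix `Phi` with arbitrary entries, a series truncated after a fixed number of terms, and hypothetical helper names (`matmul`, `dagger`, `expm_series`).

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitian conjugate of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def expm_series(a, terms=40):
    """Sum of the series 1 + a + a^2/2! + ... truncated after `terms` terms."""
    n = len(a)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[v / m for v in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# An arbitrary Hermitian matrix (illustrative values).
Phi = [[0.5, 0.3 - 0.4j],
       [0.3 + 0.4j, -1.2]]

U = expm_series([[1j * v for v in row] for row in Phi])        # e^{i Phi}
U_inv = expm_series([[-1j * v for v in row] for row in Phi])   # e^{-i Phi}
U_dag = dagger(U)

max_diff = max(abs(U_dag[i][j] - U_inv[i][j]) for i in range(2) for j in range(2))
print(max_diff)
```

The Hermitian conjugate of the summed series coincides with the series for $e^{-i\Phi}$ to rounding accuracy, which is what the term-by-term argument of the solution asserts.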
27

OPERATORS IN MATRIX REPRESENTATION 

The Time Dependence of Matrix Elements. The results obtained in 
the preceding sections make it possible to find the solutions of certain 
important quantum mechanical problems. Let us first show that it is 
possible to write equations of motion for operators similar to the 
equations of motion of Newtonian mechanics. 

We write the matrix element of a certain operator in terms of energy variables, that is

$$\lambda_{E'E} = \int \psi^*(E', x)\,\hat\lambda\,\psi(E, x)\,dx \tag{27.1}$$

The eigenfunctions of the Hamiltonian in the integrand, together with the time dependent factor, have the form (23.20):

$$\psi(E, x) = e^{-iEt/\hbar}\,\psi_0(E, x) \tag{27.2}$$

where the zero subscript of the wave function denotes that the time factor has been separated. Then the matrix element (27.1) takes the form

$$\lambda_{E'E} = e^{-i(E - E')t/\hbar} \int \psi_0^*(E', x)\,\hat\lambda\,\psi_0(E, x)\,dx \tag{27.3}$$

A matrix element without the time factor is said to be written in the Schrödinger representation, and with the factor, in the Heisenberg representation.

We now differentiate the matrix element with respect to time, assuming for generality that $\hat\lambda$ may also involve an explicit time dependence:

$$\frac{d\lambda_{E'E}}{dt} = e^{-i(E - E')t/\hbar} \int \psi_0^*(E', x)\left[\frac{\partial\hat\lambda}{\partial t} + \frac{i}{\hbar}(E' - E)\,\hat\lambda\right]\psi_0(E, x)\,dx \tag{27.4}$$

Returning to the matrix element (27.1), we rewrite Eq. (27.4) as follows:

$$\frac{d\lambda_{E'E}}{dt} = \frac{\partial\lambda_{E'E}}{\partial t} + \frac{i}{\hbar}\left(E'\lambda_{E'E} - \lambda_{E'E}E\right) \tag{27.5a}$$

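We know from the preceding section (see Eq. (26.19)) that an operator written in terms of its own variables has diagonal form. Hence

$$\mathcal H_{E'E''} = E'\,\delta_{E'E''}, \qquad \mathcal H_{E''E} = \delta_{E''E}\,E \tag{27.6}$$

But then (27.5a) can be written using the commutator for the Hamiltonian $\hat{\mathcal H}$ and the operator $\hat\lambda$:

$$\frac{d\lambda_{E'E}}{dt} = \frac{\partial\lambda_{E'E}}{\partial t} + \frac{i}{\hbar}\left(E'\delta_{E'E''}\lambda_{E''E} - \lambda_{E'E''}\delta_{E''E}E\right) = \frac{\partial\lambda_{E'E}}{\partial t} + \frac{i}{\hbar}\left((\mathcal H\lambda)_{E'E} - (\lambda\mathcal H)_{E'E}\right) \tag{27.7}$$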


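A matrix equation holds in any system of variables, notably in coordinate form:

$$\frac{d\hat\lambda}{dt} = \frac{\partial\hat\lambda}{\partial t} + \frac{i}{\hbar}\left(\hat{\mathcal H}\hat\lambda - \hat\lambda\hat{\mathcal H}\right) \tag{27.8}$$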


Note that Eq. (27.8) has a classical analogue. If there is a certain dynamical quantity $\lambda(p, q, t)$, its total time derivative is

$$\frac{d\lambda}{dt} = \frac{\partial\lambda}{\partial t} + \frac{\partial\lambda}{\partial p}\frac{dp}{dt} + \frac{\partial\lambda}{\partial q}\frac{dq}{dt} \tag{27.9}$$

Substituting $\dot p$ and $\dot q$ from Hamilton’s equations (10.6a) and (10.6b), we obtain

$$\frac{d\lambda}{dt} = \frac{\partial\lambda}{\partial t} + \frac{\partial\lambda}{\partial q}\frac{\partial\mathcal H}{\partial p} - \frac{\partial\lambda}{\partial p}\frac{\partial\mathcal H}{\partial q} \tag{27.10}$$

It will be shown in Section 31 that the expression (27.10) develops from (27.8) in the limiting transition to classical mechanics. The expression $\frac{\partial\lambda}{\partial q}\frac{\partial\mathcal H}{\partial p} - \frac{\partial\lambda}{\partial p}\frac{\partial\mathcal H}{\partial q}$ is called the Poisson bracket.

If the quantity $\lambda$ does not explicitly depend on time and its Poisson bracket with the Hamiltonian is zero, then $\dot\lambda = 0$, that is, $\lambda$ is an integral of the motion. Similarly, if the operator $\hat\lambda$ does not involve $t$ explicitly and is commutative with the Hamiltonian, then it is called the quantum integral of the motion. The corresponding quantity $\lambda$ exists in the given state simultaneously with the energy of the system.

For example, the Hamiltonian of a particle in a central Coulomb field has the form (see (24.32))

$$\hat{\mathcal H} = -\frac{\hbar^2}{2m}\,\frac{1}{r^2}\,\frac{\partial}{\partial r}\,r^2\frac{\partial}{\partial r} + \frac{\hat M^2}{2mr^2} - \frac{Ze^2}{r} \tag{27.11}$$

The operator of the angular momentum square involves only differentiation with respect to the angles and therefore commutes with the Hamiltonian. Hence, it is conserved together with the energy. The same holds for the operator of the angular momentum projection $\hat M_z = (\hbar/i)(\partial/\partial\varphi)$. Thus, within the framework permitted by quantum mechanics, the same quantities are conserved together with the energy as in classical mechanics, where all three components of angular momentum were conserved. Only in quantum mechanics, instead of the three components, we have the square of the angular momentum and its projection. Conservation of the three angular momentum projections corresponds to a path of motion lying in a plane perpendicular to the angular momentum, but in quantum mechanics there are no paths.

Let us now find the quantum analogue of the equations of motion, that is, calculate the total time derivatives of the coordinate and momentum. We have

$$\dot{\mathbf r} = \frac{d\mathbf r}{dt} = \frac{i}{\hbar}\left(\hat{\mathcal H}\mathbf r - \mathbf r\hat{\mathcal H}\right) = \frac{i}{2m\hbar}\left(\hat p^2\mathbf r - \mathbf r\hat p^2\right) \tag{27.12}$$

because the potential energy operator depends only upon the coordinate and commutes with $\mathbf r$. To find the obtained commutation relation, let us study it for one component of $\mathbf r$, for example, $x$. For this component

$$\hat p^2 x - x\hat p^2 = \hat p_x^2 x - x\hat p_x^2$$

since the squares of the other two momentum components commute with $x$. Further, we add to and subtract from the commutator the same term $\hat p_x x \hat p_x$. Collecting like terms, we obtain

$$\hat p_x\left(\hat p_x x - x\hat p_x\right) + \left(\hat p_x x - x\hat p_x\right)\hat p_x = 2\,\frac{\hbar}{i}\,\hat p_x$$

Commutation relations for the other components of the radius vector are found similarly. Substituting them into (27.12), we finally get

$$\dot x = \frac{\hat p_x}{m}, \qquad \dot{\mathbf r} = \frac{\hat{\mathbf p}}{m} \tag{27.13}$$

Hence, the velocity and momentum operators are connected by the same relationship as classical quantities.

Let us determine the derivative of the momentum operator:

$$\dot{\hat p}_x = -\frac{\partial U}{\partial x}, \qquad \dot{\hat{\mathbf p}} = -\frac{\partial U}{\partial\mathbf r} \tag{27.14}$$

where the result of Exercise 2, Section 24, has been used.

It is natural to call the operator in the right-hand side of Eqs. (27.14) the “force” operator. Thus, we have obtained for operators the same equations of motion, (27.13) and (27.14), as for classical quantities. This assertion is known as the correspondence principle.

Forming mean values of Eqs. (27.13) and (27.14) according to the rule (25.19), we find that quantum mechanical means satisfy the classical equations of motion.

The Linear Harmonic Oscillator. Let us apply Eqs. (27.13) and (27.14) to the problem of determining the energy eigenvalues of a linear harmonic oscillator. We know from Eq. (7.31) that the energy of a separate linear harmonic oscillator of unit mass is

$$E = \frac{\dot q^2}{2} + \frac{\omega^2 q^2}{2}$$

Suppose the mass of the oscillator is $m$. Then its $x$ coordinate can be measured in conventional length units, and its momentum $p$ in g·cm·s$^{-1}$ units.

Expressing the velocity in terms of the momentum according to the formula $\dot x = p/m$, we represent the Hamiltonian of the oscillator in the form

$$\mathcal H = \frac{p^2}{2m} + \frac{m\omega^2 x^2}{2} \tag{27.15}$$

In order to determine the energy eigenvalues, (27.15) must be treated as an operator equation:

$$\hat{\mathcal H} = \frac{\hat p^2}{2m} + \frac{m\omega^2\hat x^2}{2} \tag{27.16}$$

In this section, we shall solve the problem with energy as the independent variable. At first glance this may appear to be much more difficult than the coordinate representation, where we at least know the form of the operators: $\hat p = (\hbar/i)(\partial/\partial x)$. Actually though, in the case of an oscillator the matrix form possesses a number of advantages, since it enables an algebraic solution. We shall solve the oscillator problem in coordinate representation in the next section; this will require an analysis of the differential equation.

We average equation (27.16) over an arbitrary state of the oscillator:

$$\langle E\rangle = \frac{\langle p^2\rangle}{2m} + \frac{m\omega^2\langle x^2\rangle}{2} \tag{27.17}$$

It is apparent that the mean values of the positive quantities $\langle p^2\rangle$ and $\langle x^2\rangle$ are at least not negative, so that the mean value of the energy is not negative either. If the averaging is performed over the eigenstate of the Hamiltonian, then the mean value equals the eigenvalue, therefore the energy eigenvalues are not negative.

We now write the equations of motion, (27.13) and (27.14), for the oscillator, that is, putting the potential energy $U = m\omega^2x^2/2$. This yields

$$\dot{\hat x} = \hat p/m \tag{27.18}$$

$$\dot{\hat p} = -m\omega^2\hat x \tag{27.19}$$

We take the matrix elements, in energy representation, of both sides of these equations, that is, we replace each operator $\hat\lambda$ appearing in the equation with its matrix element $\lambda_{E'E}$. Since none of the operators in our problem explicitly depend upon time, we obtain from (27.5a)

$$\dot\lambda_{E'E} = \frac{i}{\hbar}(E' - E)\,\lambda_{E'E} \tag{27.5b}$$

Applying this formula to (27.18) and (27.19), we find

$$\frac{i}{\hbar}(E' - E)\,x_{E'E} = \frac{p_{E'E}}{m} \tag{27.20}$$

$$\frac{i}{\hbar}(E' - E)\,p_{E'E} = -m\omega^2 x_{E'E} \tag{27.21}$$

We solve the second equation for $p_{E'E}$ and substitute it into the first to get

$$\left((E' - E)^2 - \hbar^2\omega^2\right) x_{E'E} = 0 \tag{27.22}$$


Thus only such a matrix element $x_{E'E}$ is other than zero for which $E' - E = \pm\hbar\omega$. We see from this that the differences between neighbouring energy eigenvalues can be equal only to $\pm\hbar\omega$ and no other value. On the other hand, we have established that the energy eigenvalues are not negative, so that by subtracting the quantity $\hbar\omega$ from some energy eigenvalue a sufficient number of times we inevitably arrive at some least energy eigenvalue $E_0$. Therefore, in future we shall write the energy eigenvalue in the form

$$E_n = \hbar\omega n + E_0 \tag{27.23}$$

Accordingly, instead of $x_{E'E}$, we denote the matrix elements $x_{n'n}$, where $n'$ and $n$ are nonnegative integers. In this notation, in place of Eq. (27.22) we write a similar equation in which the labels are not the energy values themselves but their numbers:

$$\left((n' - n)^2 - 1\right) x_{n'n} = 0 \tag{27.24}$$

First of all, it is apparent that when the numbers are equal ($n' = n$), $x_{n'n} = 0$; hence the matrix $x_{n'n}$ has no diagonal elements. We put $n' = n \pm 1$. Then the first factor on the left in Eq. (27.24) vanishes, so that $x_{n\pm1,n}$ may differ from zero. At all other $n'$ this factor is not zero, so that all the other matrix elements are, like the diagonal elements, zero; thus, only $x_{n\pm1,n} \neq 0$.

Let us now determine the nonzero matrix elements $x_{n\pm1,n}$. For that we proceed from the commutation relation (24.17), which should also be rewritten in energy representation. In the right-hand side of Eq. (24.17) we have a number, so that only diagonal matrix elements result from the right-hand, and correspondingly left-hand, sides. In order to determine the matrix elements of the left-hand side we must express the matrix elements of momentum in terms of the coordinate matrix elements. This is done with the help of the equation of motion (27.20), which shows that nonzero elements in $p_{n'n}$ have the same labels as in matrix $x_{n'n}$, that is $p_{n\pm1,n}$. For these elements we obtain

$$p_{n+1,n} = im\omega\,x_{n+1,n}, \qquad p_{n-1,n} = -im\omega\,x_{n-1,n} \tag{27.25}$$

We find the commutation relations for the left-hand side according to the general rules of matrix multiplication:

$$p_{n,n+1}x_{n+1,n} + p_{n,n-1}x_{n-1,n} - x_{n,n+1}p_{n+1,n} - x_{n,n-1}p_{n-1,n} = -2im\omega\,|x_{n,n+1}|^2 + 2im\omega\,|x_{n,n-1}|^2 = \frac{\hbar}{i} \tag{27.26a}$$

But since $x_{n'n}$ is Hermitian, $x_{n,n-1} = x^*_{n-1,n}$, and therefore $|x_{n,n-1}|^2 = |x_{n-1,n}|^2$. Equation (27.26a) can also be rewritten in the form

$$|x_{n,n+1}|^2 - |x_{n-1,n}|^2 = \frac{\hbar}{2m\omega} \tag{27.26b}$$


We see that each label in the first term on the left is greater by unity than the corresponding label in the second term. Furthermore, there are no terms with negative labels, since according to (27.23) $n$ begins at zero. Therefore

$$|x_{01}|^2 = \frac{\hbar}{2m\omega}, \qquad |x_{12}|^2 = 2\,\frac{\hbar}{2m\omega}, \qquad \dots, \qquad |x_{n,n+1}|^2 = (n + 1)\,\frac{\hbar}{2m\omega} \tag{27.27}$$

Since $|x_{n,n+1}|^2 = |x_{n+1,n}|^2$, we have determined the squares of the moduli of all nonzero matrix elements of the coordinate.
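
The step from the recurrence (27.26b) to the closed form (27.27) can be reproduced by simple iteration. In the sketch below the units $\hbar = m = \omega = 1$ and the list name `x_sq` are choices made here for illustration only; exact fractions are used so that the comparison involves no rounding.

```python
from fractions import Fraction

hbar = m = omega = Fraction(1)      # illustrative units
step = hbar / (2 * m * omega)       # right-hand side of (27.26b)

x_sq = []                           # x_sq[n] holds |x_{n,n+1}|^2
previous = Fraction(0)              # no element with a negative label exists
for n in range(6):
    previous = previous + step      # |x_{n,n+1}|^2 = |x_{n-1,n}|^2 + hbar/(2 m omega)
    x_sq.append(previous)

closed_form = [(n + 1) * step for n in range(6)]   # formula (27.27)
print(x_sq == closed_form)
```

Starting the recurrence from zero is exactly the use made of the fact that $n$ begins at zero.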

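We assume that the matrix elements themselves are real numbers, since the value of their phase is not involved in any physical equation. Then, from (27.25), the matrix elements of the momentum are purely imaginary numbers. It follows then from the Hermiticity of the momentum operator that $p_{n'n} = p^*_{nn'} = -p_{nn'}$. Making use of (27.25) and (27.27), we write both matrices, $x_{n'n}$ and $p_{n'n}$, explicitly:

$$x_{n'n} = \left(\frac{\hbar}{2m\omega}\right)^{1/2} \begin{pmatrix} 0 & \sqrt{1} & 0 & 0 & \cdots \\ \sqrt{1} & 0 & \sqrt{2} & 0 & \cdots \\ 0 & \sqrt{2} & 0 & \sqrt{3} & \cdots \\ 0 & 0 & \sqrt{3} & 0 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

$$p_{n'n} = i\left(\frac{m\hbar\omega}{2}\right)^{1/2} \begin{pmatrix} 0 & -\sqrt{1} & 0 & 0 & \cdots \\ \sqrt{1} & 0 & -\sqrt{2} & 0 & \cdots \\ 0 & \sqrt{2} & 0 & -\sqrt{3} & \cdots \\ 0 & 0 & \sqrt{3} & 0 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix} \tag{27.28a}$$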

It remains for us to compute the lowest energy $E_0$ (the energy of the ground state). It can be represented as the diagonal matrix element of $\mathcal H$ with labels 0, 0. Expressing the energy matrix in terms of the squares of the coordinate and momentum matrices, we obtain

$$E_0 = \mathcal H_{00} = \frac{p_{01}p_{10}}{2m} + \frac{m\omega^2 x_{01}x_{10}}{2} = \frac{\hbar\omega}{4} + \frac{\hbar\omega}{4} = \frac{\hbar\omega}{2} \tag{27.28b}$$

where we made use of (27.28a). The meaning of zero energy will be explained in the following section.
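
The matrices (27.28a) can be checked against the commutation relation and the energy spectrum by cutting them off at a finite size. The sketch below makes several assumptions of its own: units $\hbar = m = \omega = 1$, a truncation to `N` rows and columns, and a hypothetical helper `matmul`. Because of the truncation, the last diagonal element of each product is spoiled and is left out of the comparison.

```python
import math

N = 6                       # truncation size (illustrative choice)
hbar = m = omega = 1.0      # illustrative units

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

ax = math.sqrt(hbar / (2 * m * omega))      # prefactor of the x matrix
ap = math.sqrt(m * hbar * omega / 2)        # prefactor of the p matrix (without i)

x = [[0j] * N for _ in range(N)]
p = [[0j] * N for _ in range(N)]
for n in range(N - 1):
    root = math.sqrt(n + 1)
    x[n][n + 1] = x[n + 1][n] = ax * root
    p[n + 1][n] = 1j * ap * root            # p_{n+1,n} = i m omega x_{n+1,n}
    p[n][n + 1] = -1j * ap * root           # p_{n,n+1} = -i m omega x_{n,n+1}

px, xp = matmul(p, x), matmul(x, p)
commutator_diag = [px[n][n] - xp[n][n] for n in range(N)]

x2, p2 = matmul(x, x), matmul(p, p)
energy_diag = [(p2[n][n] / (2 * m) + m * omega ** 2 * x2[n][n] / 2).real
               for n in range(N)]

print(commutator_diag[:N - 1])   # each entry should equal hbar / i
print(energy_diag[:N - 1])       # should follow hbar * omega * (n + 1/2)
```

The first entry of `energy_diag` is the ground-state energy (27.28b); the remaining entries illustrate (27.23) with $E_0 = \hbar\omega/2$.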

The Density Matrix. In discussing the Stern-Gerlach experiment in Section 24, we pointed out that a primary beam of particles arbitrarily oriented in a magnetic field splits into $2l + 1$ beams, whereas a beam with a definite value of the angular momentum projection splits only in a field that is not parallel to the $z$ axis, that is, not parallel to the field which caused the primary splitting. In the latter case it is said that a system is in a pure quantum state$^3$ characterized by a definite wave function. (An experiment performed on a system in pure state yields the same result when repeated.)

How is one to characterize the state of the initial beam? We know the probability $w_k$ of the occurrence of every value of $k$: it is equal to $(2l + 1)^{-1}$. In a pure state we must know not only the probabilities $w_k$ but also their amplitudes $c_k$, in terms of which $w_k$ is expressed as $|c_k|^2$. Hence, in a pure state a system may be characterized by a wave function

$$\psi(x) = \sum_k c_k\,\psi(k, x)$$

But for this the system must be closed, that is, it should not interact with anything. Only in that case does it have a definite Hamiltonian and its wave function $\psi(x)$ satisfies the exact wave equation.

In the Stern-Gerlach experiment, the particles (usually atoms of a metal) are produced by evaporation of a substance in a special oven. In such conditions an atom cannot be treated as a closed system, since it interacts with its surroundings. This interaction is not strong enough to affect the angular momentum square of each atom, so that the value $M^2 = \hbar^2 l(l + 1)$ is the same for all the atoms (see Sec. 24); however, it creates conditions in which all values of $k$ for the ejected atoms are equiprobable. Such a state, which has been subjected to an external action, is known as a mixture, as distinct from the pure state.

$^3$ In our case a separate particle serves as the system.

We shall not consider mixtures specifically in connection with the Stern-Gerlach experiment, but in the most general case. Let the probability of the occurrence of the $n$th state of a system as a result of some external action be $w_n$. To describe such a system it is convenient to introduce a special matrix

$$\rho(x', x) = \sum_n w_n\,\psi^*(n, x')\,\psi(n, x) \tag{27.29}$$

called the density matrix of the system.

Let us show how the mean values are calculated for a given mixture with the help of the density matrix. The general definition of the mean of a certain quantity, $\langle\lambda\rangle$, is, as usual,

$$\langle\lambda\rangle = \sum_n w_n\,\langle\lambda\rangle_n \tag{27.30}$$

where $\langle\lambda\rangle_n$ is the mean of $\lambda$ over the $n$th state (cf. (25.18)). From (25.19) the mean value of $\lambda$ over the $n$th pure state is

$$\langle\lambda\rangle_n = \int \psi^*(n, x)\,\hat\lambda\,\psi(n, x)\,dx \tag{27.31}$$

We substitute this into (27.30) and interchange the order of summation and integration. For this it is first convenient to represent $\hat\lambda(x)$ in matrix form:

$$\hat\lambda(x) = \delta(x - x')\,\hat\lambda(x') = \lambda(x', x) \tag{27.32}$$

as was done in the preceding section. We recall that if $\hat\lambda(x')$ involves a differentiation with respect to $x'$, then (27.32) is not written in diagonal form. Now, substituting (27.32) into the expression for the mean, we reduce it to the form

$$\langle\lambda\rangle = \int dx \int dx' \sum_n w_n\,\psi^*(n, x')\,\lambda(x, x')\,\psi(n, x) = \int dx \int dx'\,\lambda(x, x')\,\rho(x', x) \tag{27.33}$$

in accordance with (27.29).

We have thus obtained the diagonal element of the product matrix $\lambda\rho$, that is $\int dx'\,\lambda(x, x')\,\rho(x', x)$, integrated over $x$. The sum of the diagonal elements of a matrix, $A_{\nu\nu}$, or the integral over the diagonal elements $A(x, x)$, if the index varies continuously, is called the trace (Tr) of the matrix:

$$\operatorname{Tr} A = \sum_\nu A_{\nu\nu} \tag{27.34}$$

$$\operatorname{Tr} A = \int dx\,A(x, x) \tag{27.35}$$

Expression (27.33) is then the trace of the product of the density matrix and the matrix of the averaged quantity:

$$\langle\lambda\rangle = \operatorname{Tr}(\lambda\rho) \tag{27.36}$$
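
The equality of (27.30) and (27.36) can be seen on a small discrete example, with the integrals over $x$ replaced by sums over a two-valued index. The sketch below is an illustration under assumptions made here: two orthonormal states with weights 0.75 and 0.25, an arbitrary Hermitian matrix `lam`, and the index order `rho[a][b]` = $\sum_n w_n\psi_n(a)\psi_n^*(b)$, chosen so that the plain matrix product reproduces the mean written with the conjugate function on the left. All names are hypothetical.

```python
import math

s = 1 / math.sqrt(2)
states = [[s, 1j * s], [s, -1j * s]]     # two orthonormal states (illustrative)
weights = [0.75, 0.25]                   # probabilities w_n, summing to unity
lam = [[1.0, 2 - 1j],
       [2 + 1j, -1.0]]                   # an arbitrary Hermitian matrix

def pure_mean(psi, op):
    """Mean of op over one pure state: sum_ab conj(psi_a) op_ab psi_b."""
    return sum(psi[a].conjugate() * op[a][b] * psi[b]
               for a in range(len(psi)) for b in range(len(psi)))

# Density matrix of the mixture (index order is a convention of this sketch).
rho = [[sum(w * psi[a] * psi[b].conjugate() for psi, w in zip(states, weights))
        for b in range(2)] for a in range(2)]

mean_from_states = sum(w * pure_mean(psi, lam) for psi, w in zip(states, weights))
mean_from_trace = sum(lam[a][b] * rho[b][a] for a in range(2) for b in range(2))
trace_rho = sum(rho[a][a] for a in range(2))

print(mean_from_states, mean_from_trace, trace_rho)
```

Both routes give the same number, and the trace of `rho` equals the sum of the weights.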

The significance of Eq. (27.36) is that it does not depend on the representation adopted. The value of the trace of a matrix is invariant under a unitary transformation, that is, with respect to a transformation to another representation. Let us show this with the help of the equations of the preceding section. We write Eq. (26.48) for an arbitrary operator $A$:

$$A_{\nu'\nu} = \int\!\!\int \psi^*(\nu', x')\,A_{x'x}\,\psi(\nu, x)\,dx'\,dx \tag{27.37}$$

and find the trace of both sides, interchanging on the right the integrations over $x'$, $x$ and the summation over $\nu$:

$$\operatorname{Tr} A = \sum_\nu A_{\nu\nu} = \int\!\!\int dx'\,dx\,A_{x'x} \sum_\nu \psi^*(\nu, x')\,\psi(\nu, x) \tag{27.38}$$

We now take advantage of the fact that $\psi^*(\nu, x') = \psi(x', \nu)$ and $\psi(\nu, x) = \psi^*(x, \nu)$. Then the inner sum reduces to the form

$$\sum_\nu \psi^*(x, \nu)\,\psi(x', \nu) \tag{27.39}$$

But by virtue of the general condition of the orthogonality of wave functions this sum is

$$\sum_\nu \psi^*(x, \nu)\,\psi(x', \nu) = \delta(x - x') \tag{27.40}$$

Here Eq. (26.28) was used, with the substitution of $x$ for $\lambda$ and $\nu$ for $x$. If the spectrum of $\nu$ is discrete, the integration is replaced by a summation, but if only the spectrum of $\lambda$ (that is $x$) is continuous, the right-hand side of the equation is expressed in terms of a $\delta$ function. With the help of (27.40) we obtain

$$\operatorname{Tr} A = \sum_\nu A_{\nu\nu} = \int dx \int dx'\,A_{x'x}\,\delta(x - x') = \int dx\,A_{xx} \tag{27.41}$$

where the fundamental property of the $\delta$ function was used.

22 * 
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Thus, the expression of a mean quantity in terms of the trace of 
the density matrix is valid in any representation, as it should be for 
any physical quantity. 

Incidentally, we note an important property of the trace of a
product matrix:

$$\sum_{\nu} (AB)_{\nu\nu} = \sum_{\nu,\mu} A_{\nu\mu} B_{\mu\nu} = \sum_{\mu,\nu} B_{\mu\nu} A_{\nu\mu} = \sum_{\mu} (BA)_{\mu\mu} = \sum_{\nu} (BA)_{\nu\nu} \qquad (27.42a)$$

(We have, as many times before, interchanged the indices with respect
to which the summation was performed.) In abbreviated form we write

$$\mathrm{Tr}\,(AB) = \mathrm{Tr}\,(BA) \qquad (27.42b)$$

The invariance of the trace in Hilbert space has an analogue in
Euclidean space: the sum of the diagonal components of a tensor
of rank 2, $A_{\alpha\alpha}$, is a scalar, and it does not change in a rotation of
the coordinate axes.
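The two trace properties, $\mathrm{Tr}\,(AB) = \mathrm{Tr}\,(BA)$ and invariance under a unitary transformation, can be checked numerically. The following is only a minimal sketch in plain Python; the matrices `A`, `B` and the unitary matrix `U` are arbitrary illustrative choices and do not come from the text.

```python
import cmath
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal elements, as in Eq. (27.34)."""
    return sum(a[i][i] for i in range(len(a)))

def dagger(u):
    """Hermitian conjugate: transpose plus complex conjugation."""
    n = len(u)
    return [[complex(u[j][i]).conjugate() for j in range(n)] for i in range(n)]

# Arbitrary illustrative matrices (assumptions, not taken from the text).
A = [[1 + 2j, 0.5], [3.0, -1j]]
B = [[2.0, 1j], [-1.0, 4.0]]

# A unitary 2x2 matrix: a rotation by angle t with a phase on the first column.
t, phase = 0.7, cmath.exp(0.3j)
U = [[math.cos(t) * phase, -math.sin(t)],
     [math.sin(t) * phase, math.cos(t)]]

A_new = matmul(dagger(U), matmul(A, U))   # the matrix A in the new representation

print(abs(trace(matmul(A, B)) - trace(matmul(B, A))))  # close to zero
print(abs(trace(A_new) - trace(A)))                    # close to zero
```

Both printed differences should vanish up to rounding error, whatever matrices are chosen.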

With the help of the fundamental equation (27.36) it is simple
to find the trace of the density matrix itself. For that it is sufficient
to substitute $\lambda = 1$ into the formula. We obtain

$$\mathrm{Tr}\,\rho = 1 \qquad (27.43)$$

We now find the trace of the square of the density matrix, that
is, of the product of $\rho$ multiplied by itself. According to the
rule of matrix multiplication,

$$(\rho \times \rho)_{xx'} = \int dx''\, \rho(x, x'')\, \rho(x'', x') \qquad (27.44)$$

We substitute the expressions for the matrices $\rho(x, x'')$ and
$\rho(x'', x')$ and interchange the order of summation and integration.
Making use of the orthogonality of the functions $\psi(n, x'')$ and
$\psi^*(n', x'')$, we obtain

$$\int dx''\, \rho(x, x'')\, \rho(x'', x') = \sum_{n,n'} w_n w_{n'}\, \psi^*(n, x)\, \psi(n', x') \int \psi^*(n', x'')\, \psi(n, x'')\, dx'' = \sum_{n} w_n^2\, \psi^*(n, x)\, \psi(n, x')$$

To find the trace of this expression we must put $x' = x$ and integrate
over $x$. Thanks to normalization, every integral of $|\psi(n, x)|^2$
is equal to unity, so that

$$\mathrm{Tr}\,(\rho \times \rho) = \sum_{n} w_n^2 \qquad (27.45)$$

But the trace of the density matrix is, as we already know from
(27.43),

$$\mathrm{Tr}\,\rho = \sum_{n} w_n \int \psi^*(n, x)\, \psi(n, x)\, dx = \sum_{n} w_n = 1 \qquad (27.46)$$

If the probability of some event does not become a certainty,
then $w_n < 1$. Hence, the square of the probability is smaller than
the probability itself: $w_n^2 < w_n$. Comparing (27.45) with (27.46),
we see that

$$\mathrm{Tr}\,(\rho \times \rho) \le \mathrm{Tr}\,\rho \qquad (27.47a)$$

The inequality may become an equality only in the case of a pure
state, when one of the probabilities, for example $w_n$, is equal to
unity, and all $w_{n' \ne n} = 0$. Then

$$\mathrm{Tr}\,(\rho \times \rho) = \mathrm{Tr}\,\rho = 1 \qquad (27.47b)$$

The degree to which $\mathrm{Tr}\,(\rho \times \rho)$ approximates to $\mathrm{Tr}\,\rho$ can be treated
as a measure of the purity of the state.

Thus, the density matrix provides the fullest possible description
of a mixture.
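The purity criterion can be illustrated on the smallest possible example. The sketch below is an assumption-laden toy model, not part of the text: it replaces the continuous coordinate by a two-component index, builds the density matrix from two orthonormal states with weights $w_n$, and evaluates $\mathrm{Tr}\,(\rho \times \rho)$. The mixing angle `t` and the weights are arbitrary.

```python
import math

def density_matrix(weights, states):
    """rho[i][j] = sum_n w_n * conj(psi_n[i]) * psi_n[j] for orthonormal psi_n."""
    dim = len(states[0])
    return [[sum(w * complex(s[i]).conjugate() * s[j]
                 for w, s in zip(weights, states))
             for j in range(dim)] for i in range(dim)]

def trace_of_product(a, b):
    """Tr(a b), computed without forming the full product matrix."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

t = 0.4  # arbitrary angle defining an orthonormal pair of two-component states
states = [(math.cos(t), math.sin(t)), (-math.sin(t), math.cos(t))]

def purity(weights):
    """Tr(rho x rho) for the mixture with the given weights."""
    rho = density_matrix(weights, states)
    return trace_of_product(rho, rho).real

print(purity([1.0, 0.0]))   # pure state
print(purity([0.7, 0.3]))   # mixture, equals 0.7**2 + 0.3**2 up to rounding
print(purity([0.5, 0.5]))   # equal weights
```

The result depends only on the weights and not on the angle `t`, in line with the invariance of the trace.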

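Let us now find the equation of motion of the density matrix, which
replaces the Schrödinger equation (23.11), $-(h/i)(\partial\psi/\partial t) = \hat{\mathcal{H}}\psi$.
We take the derivative of the density matrix with respect to time:

$$\frac{h}{i}\frac{\partial\rho(x', x)}{\partial t} = \sum_{n} w_n \left( \frac{h}{i}\frac{\partial\psi^*(n, x')}{\partial t}\,\psi(n, x) + \psi^*(n, x')\,\frac{h}{i}\frac{\partial\psi(n, x)}{\partial t} \right)$$

But

$$\frac{h}{i}\frac{\partial\psi^*}{\partial t} = \hat{\mathcal{H}}^*\psi^*, \qquad \frac{h}{i}\frac{\partial\psi}{\partial t} = -\hat{\mathcal{H}}\psi$$

whence we obtain

$$\frac{\partial\rho(x', x)}{\partial t} = \frac{i}{h}\left( \hat{\mathcal{H}}^*(x')\,\rho(x', x) - \hat{\mathcal{H}}(x)\,\rho(x', x) \right) \qquad (27.48)$$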


Since this equation is in operator form, it is valid in any
representation.

28

SOME PROBLEMS IN COORDINATE REPRESENTATION

In this section we shall obtain solutions of the wave equation for
certain cases, which are in part illustrative and in part of an
auxiliary nature. Nevertheless, they help to reveal many important
regularities.

Problems involving boundary conditions imposed on the wave
function in its dependence on the coordinates would be difficult to
solve in any but the coordinate representation. The greater part
of this section is devoted to such problems. In addition, a solution
is presented of the linear harmonic oscillator problem with the help
of the Schrödinger equation. Irrespective of the earlier obtained
result in the energy representation, the solution of this problem in
the coordinate representation is of major interest as an illustration
of the computational methods of quantum mechanics.

A Particle in a One-Dimensional, Infinitely Deep Potential Well.
Suppose a particle is constrained to move in one dimension, remaining
in an interval of length $a$, so that $0 \le x \le a$. We can imagine
that at points $x = 0$ and $x = a$ there are absolutely impenetrable
walls which reflect the particle. A limitation of this type is
represented with the aid of the potential energy curve shown in Figure 30:
$U = \infty$ at $x < 0$ and at $x > a$. We put $U = 0$ at $0 \le x \le a$; this
is the potential energy gauge. To leave the region $0 \le x \le a$, a
particle would have to perform an infinitely large quantity of
work. Thus, the probability for the particle to be at $x = 0$ or $x = a$
is equal to zero. With the aid of (23.1), we obtain

$$\psi(0) = \psi(a) = 0 \qquad (28.1)$$

These boundary conditions may also be justified by means of a limiting
transition from a well of finite depth to a well of infinite depth.
This will be done later.

Insofar as the potential energy is time independent, we can write
the equation for the energy eigenvalues directly; this is Eq. (23.21).
The motion is one-dimensional and, therefore, we must take the
total derivative $d^2/dx^2$ in place of $\nabla^2$. From this we have

$$-\frac{h^2}{2m}\frac{d^2\psi}{dx^2} = E\psi \qquad (28.2)$$

We introduce the shortened notation

$$\frac{2mE}{h^2} = \varkappa^2 \qquad (28.3)$$

so that the wave equation will be of the form

$$\frac{d^2\psi}{dx^2} = -\varkappa^2\psi \qquad (28.4)$$

The solution to (28.4) is well known:

$$\psi = C_1 \sin\varkappa x + C_2 \cos\varkappa x \qquad (28.5)$$

But from (28.1) $\psi(0) = 0$, so that the cosine term must be omitted
by putting $C_2 = 0$. There remains

$$\psi = C_1 \sin\varkappa x \qquad (28.6)$$

We now substitute the second boundary condition

$$\psi(a) = C_1 \sin\varkappa a = 0 \qquad (28.7)$$

This is an equation in $\varkappa$. It has an infinite number of solutions:

$$\varkappa a = n\pi \qquad (28.8)$$

where $n$ is any positive integer:

$$1 \le n < \infty \qquad (28.9)$$


The value $n = 0$ is precluded, because at $n = 0$ the wave function
vanishes everywhere ($\psi = \sin 0 = 0$); hence $|\psi|^2 = 0$, and the
particle simply does not exist anywhere (a trivial solution).

Now substituting $\varkappa$ from the definition (28.3) and solving (28.8)
with respect to $E$, we find the expression for energy, that is, the
energy spectrum for the problem being examined:

$$E_n = \frac{\pi^2 h^2}{2m a^2}\, n^2 \qquad (28.10)$$

The boundary condition imposed on a wave function is just as
essential for finding the energy spectrum as the wave equation
itself. As is apparent from (28.10), a solution exists not for all energy
values but only for those that belong to a definite set of numbers
characteristic of the given problem. Depending on the conditions,
these numbers may form either a discrete series, as in the present
problem, or a continuous set, as in the problem on the free motion of
a particle.

Indeed, in the free motion of a particle its wave function must
remain finite everywhere. This condition is satisfied by the function
(23.3) at all real and positive energy values. In this case imaginary
momentum would correspond to negative energies, and the
coordinate dependence of the wave function would have the form
of an exponential with a real exponent. But such a function becomes
infinite at $x = +\infty$ or $x = -\infty$.

The solution of Eq. (23.21) for stationary states is always associated
with finding the energy spectrum. In contrast with Bohr's
theory, where the discreteness of states was a necessary, but alien,
appendage to classical motion, in quantum mechanics the very
character of the motion determines the energy spectrum. This will
be made especially apparent in the examples that follow.

Let us now return to the wave function (28.6). It vanishes within
the interval $(0, a)$ (that is, except at its ends) $n - 1$ times. The number
of zeros (nodes) of the wave function is one less than the number of the
energy eigenvalue.

This result is easily understood from the following reasoning.
At $n = 1$, there is one sinusoidal half-wave in the interval $(0, a)$;
at $n = 2$, there is one full wave (two half-waves); at $n = 3$ there are
three half-waves, etc. Hence, the greater the value of $n$ the smaller the
de Broglie wavelength $\lambda$. But energy is proportional to the square of
momentum, that is, inversely proportional to the square of $\lambda$, according to
(22.2a). Hence, the smaller the $\lambda$ the greater the energy. This conclusion
holds, of course, not only for wave functions of a purely
sinusoidal shape, though as a qualitative rather than an exact
quantitative relationship: the more zeros, or nodes, the wave function
has, the greater the energy.

The least-energy state corresponds to a wave function which has
no nodes anywhere within the interval. It is called the ground state,
all the other states being termed excited.

It remains to determine the coefficient $C_1$ in order to define the
wave function completely. We shall find it from the normalization
condition (23.17):

$$C_1^2 \int_0^a \sin^2\varkappa x\, dx = C_1^2 \int_0^a \frac{1 - \cos 2\varkappa x}{2}\, dx = C_1^2 \left( \frac{x}{2} - \frac{\sin 2\varkappa x}{4\varkappa} \right)\Bigg|_0^a = \frac{C_1^2 a}{2} = 1$$

The second term of the integrated expression becomes zero at both
limits in accordance with (28.8). Thus

$$C_1 = (2/a)^{1/2} \qquad (28.11)$$

$$\psi_n = \left(\frac{2}{a}\right)^{1/2} \sin\frac{n\pi x}{a} \qquad (28.12)$$
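The spectrum (28.10) and the normalized functions (28.12) are easy to check numerically. The sketch below assumes units in which $h = m = a = 1$ (an arbitrary choice made only for illustration) and uses a simple midpoint rule; the helper names are hypothetical.

```python
import math

def energy(n, a=1.0, m=1.0, h=1.0):
    """Energy level of Eq. (28.10); h is the action quantum as written in the text."""
    return (math.pi * h * n) ** 2 / (2.0 * m * a * a)

def psi(n, x, a=1.0):
    """Normalized wave function of Eq. (28.12)."""
    return math.sqrt(2.0 / a) * math.sin(n * math.pi * x / a)

def samples(n, a=1.0, steps=2000):
    """Values of psi at the midpoints of `steps` equal subintervals of (0, a)."""
    dx = a / steps
    return [psi(n, (i + 0.5) * dx, a) for i in range(steps)], dx

def norm(n, a=1.0, steps=2000):
    """Midpoint-rule estimate of the integral of psi**2 over (0, a)."""
    vals, dx = samples(n, a, steps)
    return sum(v * v for v in vals) * dx

def interior_nodes(n, a=1.0, steps=2000):
    """Number of sign changes of psi strictly inside (0, a)."""
    vals, _ = samples(n, a, steps)
    return sum(1 for u, v in zip(vals, vals[1:]) if u * v < 0)

for n in range(1, 5):
    # columns: n, E_n / E_1, normalization integral, number of interior nodes
    print(n, energy(n) / energy(1), round(norm(n), 6), interior_nodes(n))
```

The energy ratios grow as $n^2$, the normalization integral stays at unity, and the number of interior sign changes grows by one with each level, the ground state having none.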

The wave function (28.12) is real. Therefore, from (23.19) the
current (particle flux) in this state is zero. This can also be explained
as follows. The wave function (28.12) separates into a sum of two
exponentials. Together with the time factor $e^{-iEt/h}$, each such
exponential represents the wave function of a free particle, (23.3), one
of them corresponding to momentum $p = h\varkappa$, and the other to the
same momentum but with opposite sign. Thus, state (28.12) represents
a superposition of two states of opposite momentum, both states
having equal amplitudes.

The mean momentum of a particle moving in a potential well
according to the laws of classical mechanics is equal to zero: it
changes its sign in every reflection from the walls of the well. In this
sense we can say that in the case of quantum motion the mean
momentum of a particle is also zero. The difference is that at every
given instant classical momentum possesses a definite value, whereas
the quantum momentum of a particle in a well has no such value:
the wave function is represented as a sum of states with momenta
of both signs. This corresponds to the uncertainty principle: since
the coordinate of the particle is restricted to the limits $0 \le x \le a$,
its momentum cannot have an exact value.

Note, furthermore, that in this particular problem of a rectangular
well the square of the momentum is equal to $h^2\varkappa^2$, because the
uncertainty extends only to the sign of the momentum. The square
of the momentum is in this case proportional to the energy. For
a well of arbitrary shape the square of the momentum is also
indeterminate.

A Particle in a Three-Dimensional Infinitely Deep Potential Well.
Let us now suppose that a particle is contained in a box whose
edges are $a_1$, $a_2$, $a_3$. Generalizing the boundary conditions (28.1),
we conclude that the wave function becomes zero on all the sides of
the box:

$$\psi(0, y, z) = \psi(x, 0, z) = \psi(x, y, 0)$$
$$= \psi(a_1, y, z) = \psi(x, a_2, z) = \psi(x, y, a_3) = 0 \qquad (28.13)$$

The wave equation must now be written in three-dimensional
form:

$$-\frac{h^2}{2m}\left( \frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} \right) = E\psi \qquad (28.14)$$

It is convenient to write the solution as follows:

$$\psi = C \sin\varkappa_1 x\, \sin\varkappa_2 y\, \sin\varkappa_3 z \qquad (28.15)$$

It is written only in terms of sines and not cosines so as to satisfy
the first line of the boundary conditions (28.13). The quantities $\varkappa_1$,
$\varkappa_2$, $\varkappa_3$ are determined from the second line of the boundary
conditions (28.13). The factors of (28.15) turn zero either at $x = a_1$, or
$y = a_2$, or $z = a_3$. In other words

$$\sin\varkappa_1 a_1 = 0, \quad \varkappa_1 a_1 = n_1\pi$$
$$\sin\varkappa_2 a_2 = 0, \quad \varkappa_2 a_2 = n_2\pi \qquad (28.16)$$
$$\sin\varkappa_3 a_3 = 0, \quad \varkappa_3 a_3 = n_3\pi$$

Here $n_1$, $n_2$, and $n_3$ are integers of which none are equal to zero
(otherwise $\psi$ would be equal to zero over all the box).

We substitute (28.15) into (28.14) and take advantage of the fact
that an equation of the form

$$\frac{d^2}{dx^2}\sin\varkappa_1 x = -\varkappa_1^2 \sin\varkappa_1 x \qquad (28.17)$$

holds for each factor in (28.15), which yields

$$\nabla^2\psi = -(\varkappa_1^2 + \varkappa_2^2 + \varkappa_3^2)\,\psi$$

For Eq. (28.14) to be satisfied the energy must be related to $\varkappa_1$,
$\varkappa_2$, and $\varkappa_3$ in the following way:

$$E = \frac{h^2}{2m}\left( \varkappa_1^2 + \varkappa_2^2 + \varkappa_3^2 \right) \qquad (28.18)$$

Substituting $\varkappa_1$, $\varkappa_2$, $\varkappa_3$ from (28.16) into (28.18), we obtain the
energy eigenvalues

$$E_{n_1 n_2 n_3} = \frac{\pi^2 h^2}{2m}\left( \frac{n_1^2}{a_1^2} + \frac{n_2^2}{a_2^2} + \frac{n_3^2}{a_3^2} \right) \qquad (28.19)$$

The ground state energy is

$$E_{111} = \frac{\pi^2 h^2}{2m}\left( \frac{1}{a_1^2} + \frac{1}{a_2^2} + \frac{1}{a_3^2} \right) \qquad (28.20)$$

The value $E = 0$ is impossible by virtue of the uncertainty principle;
a particle contained in a well of finite dimensions has no
strictly defined momentum, notably zero momentum. Since $p_x = \pm h\varkappa_1$,
we see that $\Delta p_x = 2h\varkappa_1$. Substituting $\Delta p_x$ into the
uncertainty relation (22.4), we find that the minimum value of $\varkappa_1$ is $\pi/a_1$.
But this corresponds precisely to Eq. (28.18) if we substitute
$\varkappa_i = \pi/a_i$ ($i = 1, 2, 3$).
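As an illustration of (28.19), the lowest levels of a cubic box can be listed together with the number of states belonging to each. This is a sketch under assumed units $h = m = 1$ and unit edges; the function names are hypothetical. For a cube the energy depends only on $n_1^2 + n_2^2 + n_3^2$, so several triples may share one level.

```python
import math
from collections import defaultdict

def box_energy(n1, n2, n3, a1=1.0, a2=1.0, a3=1.0, m=1.0, h=1.0):
    """Energy eigenvalue of Eq. (28.19)."""
    return (math.pi * h) ** 2 / (2.0 * m) * (
        (n1 / a1) ** 2 + (n2 / a2) ** 2 + (n3 / a3) ** 2)

def cube_levels(nmax):
    """Group the states of a cube (a1 = a2 = a3) by n1^2 + n2^2 + n3^2.

    With quantum numbers limited to nmax = 3 the grouping is complete
    only for sums below 18 (the smallest sum that needs n = 4).
    """
    levels = defaultdict(list)
    for n1 in range(1, nmax + 1):
        for n2 in range(1, nmax + 1):
            for n3 in range(1, nmax + 1):
                levels[n1 * n1 + n2 * n2 + n3 * n3].append((n1, n2, n3))
    return dict(sorted(levels.items()))

levels = cube_levels(3)
for s in list(levels)[:6]:
    # columns: n1^2 + n2^2 + n3^2, number of states, one representative state
    print(s, len(levels[s]), levels[s][0])
```

The ground state (1, 1, 1) is single, while the next levels of the cube contain several states each; unequal edges would in general split them.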


Calculation of the Number of Possible States. To each value of the
three numbers $n_1$, $n_2$, $n_3$ there corresponds one possible particle state.
Let us find the distribution of the number of possible states of a
system over the values of $n_1$, $n_2$, and $n_3$ for large numbers $n_1$, $n_2$, $n_3$.

Numbers that are large in comparison with unity can be differentiated:
the differential $dn_1$ denotes an interval of numbers that is
small in comparison with $n_1$ but still includes many separate integral
values of $n_1$. In other words, $1 \ll dn_1 \ll n_1$. It is then clear that
there are exactly $dn_1$ possible integral numbers in the interval $dn_1$,
and similarly in the intervals $dn_2$ and $dn_3$.

Let us plot $n_1$, $n_2$, and $n_3$ on a system of coordinate axes. In this
space we construct a parallelepiped with sides $dn_1$, $dn_2$, $dn_3$, so
that its volume is equal to $dn_1\, dn_2\, dn_3$. In accordance with the
foregoing, to each point within this parallelepiped, the coordinates
$n_1$, $n_2$, and $n_3$ of which are integers, there corresponds one possible
particle state in a three-dimensional potential well. There are
$dn_1\, dn_2\, dn_3$ such points within the parallelepiped. Hence, denoting
the number of states within the volume $dN(n_1, n_2, n_3)$, we obtain

$$dN(n_1, n_2, n_3) = dn_1\, dn_2\, dn_3 \qquad (28.21)$$

Substituting $\varkappa_1$, $\varkappa_2$, $\varkappa_3$ from (28.16), we obtain the expression for
the number of states in terms of $d\varkappa_1\, d\varkappa_2\, d\varkappa_3$:

$$dN(\varkappa_1, \varkappa_2, \varkappa_3) = \frac{a_1 a_2 a_3\, d\varkappa_1\, d\varkappa_2\, d\varkappa_3}{\pi^3}$$

But since $a_1 a_2 a_3$ is the geometrical volume of the box, $V$, it follows
that

$$dN(\varkappa_1, \varkappa_2, \varkappa_3) = \frac{V\, d\varkappa_1\, d\varkappa_2\, d\varkappa_3}{\pi^3} \qquad (28.22)$$

The numbers $\varkappa_1$, $\varkappa_2$, $\varkappa_3$ take on only positive values.

In examining the motion of a particle in a one-dimensional potential
well we pointed out that to each value of $\varkappa_1$ there correspond
two values of the momentum projection, equal in magnitude and
opposite in sign. Therefore, if we compare the number of states
within the intervals $d\varkappa_1$ and $dp_x/h$, we find that the latter
includes half the number of states of the former. Accordingly, the
number of states in the interval of momentum values $dp_x\, dp_y\, dp_z$ is

$$dN(p_x, p_y, p_z) = \frac{V\, dp_x\, dp_y\, dp_z}{(2\pi h)^3} \qquad (28.23)$$

where $p_x$, $p_y$, and $p_z$ assume all real values from $-\infty$ to $\infty$.

In other words, we have managed to discard the rather artificial
assumption that a particle must be moving in a box, and the box
must be rectangular in shape. Referring, for example, (28.23) to
unit volume, we obtain a quite general formula for the corresponding
number of states of a quantum particle, the dimension of which is
$1/\mathrm{cm}^3$. Note that Eq. (28.23) can be developed only in quantum
mechanics, thanks to the finite (nonzero) value of the action quantum.

Equation (28.23) agrees with the uncertainty relation (22.4).
If the motion is restricted along $x$ by the interval $(0, a_1)$, then only
those states differ physically for which the momentum projections
differ by not less than $2\pi h/a_1$. Hence, within the interval $dp_x$ there
are $dp_x/(2\pi h/a_1) = a_1\, dp_x/(2\pi h)$ states. Finding the product
$(a_1\, dp_x)/(2\pi h) \times (a_2\, dp_y)/(2\pi h) \times (a_3\, dp_z)/(2\pi h)$, we arrive at Eq.
(28.23). In order to assure that the correct numerical coefficient
is obtained in evaluating the number of states from the uncertainty
relation, the quantity $2\pi h$ was put in the right-hand side of (22.4),
or $2\pi$ in (19.6a).

Let us now consider the number of states after somewhat changing
the variables. We plot the momentum projections $p_x$, $p_y$, and $p_z$
on a system of coordinate axes and count the number of states in
momentum space lying between two spheres of radii $p$ and $p + dp$.
The required number is equal to the integral of (28.23) over the
volume contained between the two spheres. The corresponding volume
in momentum space is equal to the surface of a sphere of radius $p$,
that is $4\pi p^2$, multiplied by $dp$:

$$dN(p) = \int dN(p_x, p_y, p_z) = \frac{4\pi p^2\, dp\, V}{(2\pi h)^3} \qquad (28.24)$$

We now pass from this to the energy of a particle, $E = p^2/(2m)$.
Making use of the fact that $p = (2mE)^{1/2}$, $p\, dp = m\, dE$, we find
the number of states corresponding to the interval of energy values
from $E$ to $E + dE$:

$$dN(E) = \frac{V m^{3/2} E^{1/2}\, dE}{2^{1/2}\pi^2 h^3} \qquad (28.25)$$

Thus, the number of states in the interval between $E$ and $E + dE$
increases in direct proportion to $E^{1/2}$. In a one-dimensional potential
well we would obtain $(a\, m^{1/2}\, dE)/(2^{1/2}\pi h E^{1/2}) \propto E^{-1/2}$.
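Formula (28.25) can be compared with a direct count of states in a cubic box. Integrating (28.25) from 0 to $E$ gives $V(2mE)^{3/2}/(6\pi^2 h^3)$, which for a cube of edge $a$ equals $\pi R^3/6$ with $R = a(2mE)^{1/2}/(\pi h)$, that is, the volume of one octant of a sphere of radius $R$ in the space of $n_1$, $n_2$, $n_3$. The sketch below counts the integer points exactly; restricting $R$ to integer values is an assumption made only to keep the arithmetic exact.

```python
import math

def count_states(R):
    """Number of triples of positive integers with n1^2 + n2^2 + n3^2 <= R^2.

    For a cube of edge a this is the number of states with energy not
    exceeding E, where R = a * (2*m*E)**0.5 / (pi * h); R is an integer here.
    """
    R2 = R * R
    total = 0
    for n1 in range(1, R + 1):
        for n2 in range(1, R + 1):
            rest = R2 - n1 * n1 - n2 * n2
            if rest < 1:
                break                       # larger n2 only makes rest smaller
            total += math.isqrt(rest)       # allowed n3 run from 1 to isqrt(rest)
    return total

def smooth_estimate(R):
    """Integral of Eq. (28.25) from 0 to E, expressed through R: pi * R^3 / 6."""
    return math.pi * R ** 3 / 6.0

for R in (50, 200):
    print(R, count_states(R) / smooth_estimate(R))
```

The ratio approaches unity from below as $R$ grows: the exact count falls short of the smooth estimate by a surface term, which becomes negligible only at energies far above the ground state.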

In courses in mathematical physics it is shown that (28.25) is
valid for energy values that are very great in comparison with the
energy of the ground state. Then the number of states is proportional
to the volume and does not depend on its shape.

One-Dimensional Potential Well of Finite Depth. We shall now
consider the motion of a particle in a one-dimensional potential
well of finite depth. We specify it in the following way:

$$U = \infty \quad \text{at} \quad -\infty < x < 0$$
$$U = 0 \quad \text{at} \quad 0 \le x \le a$$
$$U = U_0 \quad \text{at} \quad a < x < \infty$$

In other words, the potential energy for $x > 0$ is everywhere equal
to $U_0$ except within a region of width $a$ near the coordinate origin,
which region we called the well. For $x < 0$ the potential energy is
infinite (see Figure 31).⁴

Since the solution is of different analytical form inside and outside
the well, we must find the conditions for matching the wave functions
at the boundary $x = a$. Let us take the wave equation in the form

$$-\frac{h^2}{2m}\frac{d^2\psi}{dx^2} + U(x)\,\psi = E\psi \qquad (28.26)$$

in which $U(x)$ is defined by the curve in Figure 31, and integrate
both sides over a narrow region $a - \delta \le x \le a + \delta$ including the
point of discontinuity of the potential energy $x = a$. The integration
gives

$$-\frac{h^2}{2m}\left[ \left(\frac{d\psi}{dx}\right)_{a+\delta} - \left(\frac{d\psi}{dx}\right)_{a-\delta} \right] = \int_{a-\delta}^{a+\delta} (E - U)\,\psi\, dx \qquad (28.27)$$

Even though $U(x)$ suffers a discontinuity at the boundary of the
well, it remains finite everywhere. Therefore, when $\delta$ approaches
zero, the integral on the right also approaches zero. It follows that
the left-hand side of (28.27) is also zero. In other words

$$\left(\frac{d\psi}{dx}\right)_{a+0} = \left(\frac{d\psi}{dx}\right)_{a-0} \qquad (28.28)$$

that is, the limit of the derivative on the right is equal to its limit
on the left.

⁴ It was shown in Section 20 how a three-dimensional wave equation can
be reduced to a one-dimensional equation by substituting $\psi = \Phi/r$. Then,
if the particle's angular momentum is zero, Eq. (24.32) reduces completely to
a one-dimensional one, with the difference that now $r$ varies only from zero
to infinity. This may be attained formally by situating an infinitely high
potential wall at $x = 0$. Figure 31 actually refers to a particle with zero angular
momentum in a spherical potential well.

This argument would not hold in the problem of an infinitely 
deep well because then the integral in (28.27) would involve an 
indeterminate quantity. 

We shall now show by a limiting process that the wave function
also does not suffer a discontinuity at the boundary. Let us assume
the reverse, namely, that $\psi$ suffers a discontinuity $\Delta$: that is,
$\psi(a + 0) - \psi(a - 0) = \Delta$. Furthermore, we assume that the
discontinuity occurs not at a point but within some narrow domain
close to $x = a$. At points $x - a = \pm\delta$ the wave function ties on
smoothly to a certain solution of the wave equation for the narrow
transitional domain $a - \delta \le x \le a + \delta$. We thus obtain a stepped
curve, only the edges of the steps are rounded. The latter follows
from (28.28): the derivative of the wave function cannot have
discontinuities at $x = a \pm \delta$.

The order of magnitude of the derivative of the wave function in the
transitional domain is $d\psi/dx \sim \Delta/\delta$, so that in the limit, as
$\delta \to 0$, it becomes infinite (more precisely, it should become infinite
if $\Delta \ne 0$!).

Let us now multiply both sides of (28.26) by $\psi$ and transform by
parts to get

$$\frac{d}{dx}\left(\psi\frac{d\psi}{dx}\right) - \left(\frac{d\psi}{dx}\right)^2 = -\frac{2m}{h^2}(E - U)\,\psi^2$$

We integrate this expression between $a - \delta$ and $a + \delta$, obtaining

$$\left(\psi\frac{d\psi}{dx}\right)\Bigg|_{a-\delta}^{a+\delta} - \int_{a-\delta}^{a+\delta}\left(\frac{d\psi}{dx}\right)^2 dx = -\int_{a-\delta}^{a+\delta}\frac{2m}{h^2}(E - U)\,\psi^2\, dx \qquad (28.29)$$

and then perform the limiting process $\delta \to 0$. We may write the
integrated terms thus:

$$\left(\psi\frac{d\psi}{dx}\right)_{a+\delta} - \left(\psi\frac{d\psi}{dx}\right)_{a-\delta} = \left[\psi(a + \delta) - \psi(a - \delta)\right]\left(\frac{d\psi}{dx}\right)_{a\pm\delta}$$

because the derivative, as was shown, is not subject to a discontinuity.

Within the assumed discontinuity region of the $\psi$ function, $d\psi/dx$
is of the order of $\Delta/\delta$, but at the boundaries of the region it reverts
to values independent of $\delta$ and therefore finite in accordance with
the fact that the edges of the steps are assumed to be rounded. Hence,
the whole integrated part on the left in (28.29) is of the order of
$(d\psi/dx)_{a\pm\delta}\,\Delta$. The remaining integral is estimated thus:

$$\int_{a-\delta}^{a+\delta}\left(\frac{d\psi}{dx}\right)^2 dx \sim \left(\frac{\Delta}{\delta}\right)^2 \delta = \frac{\Delta^2}{\delta}$$

Hence it tends to infinity as $\delta$ tends to zero. The right-hand side of
(28.29) is finite for $\delta \to 0$. Thus, the assumption that $\psi(x)$ has a
finite discontinuity $\Delta$ results in an infinite term in Eq. (28.29),
which cannot be cancelled out, that is, in a contradiction. To eliminate
the contradiction we must assume that $\Delta = 0$, in other words,
the wave function does not suffer a discontinuity.

Thus, at the discontinuities of the potential energy curve the
wave function is continuous together with its first derivative.

Actually, we know of no interactions in nature that would correspond
to potential energy curves with discontinuities. However,
there are forces that decrease very rapidly with distance, specifically,
nuclear forces. As yet we are unable to formulate an exact law of their
dependence upon distance, but situations are well known when these
forces pass from very large values to zero over distances substantially
smaller than the de Broglie wavelength of nuclear particles. In such
cases the dependence of the force upon distance can be legitimately
approximated by the stepped curve in Figure 31. Such an approximation
yields quite reasonable results in a number of cases.

But having assumed the existence of a discontinuity in the potential
energy curve, it was necessary to investigate the behaviour of
the wave function in the neighbourhood of the discontinuity, so that
the model of a discontinuous potential curve would be intrinsically
consistent and not lead to mathematical contradictions.

Having the boundary conditions, we can now develop a solution
for the wave equation. The wave equation for the region $0 \le x \le a$
(inside the well) is of the form

$$-\frac{h^2}{2m}\frac{d^2\psi}{dx^2} = E\psi$$

We take its solution

$$\psi = C_1 \sin\varkappa x \qquad (28.30)$$

where $\varkappa$ is defined from (28.3). The solution involving the sine only
is taken because at the left edge of the well, where the potential
energy suffers an infinite discontinuity, $\psi$ satisfies the boundary
condition (28.1), $\psi(0) = 0$.

The wave equation outside the well, where $x > a$, is

$$-\frac{h^2}{2m}\frac{d^2\psi}{dx^2} = (E - U_0)\,\psi \qquad (28.31)$$

(i) First we consider the case $E > U_0$. Introducing the abbreviated
notation

$$\varkappa_1^2 = \frac{2m}{h^2}(E - U_0)$$

we obtain (28.31) in the standard form (28.4):

$$\frac{d^2\psi}{dx^2} = -\varkappa_1^2\,\psi \qquad (28.32)$$

whence

$$\psi = C_2 \sin\varkappa_1 x + C_3 \cos\varkappa_1 x \qquad (28.33)$$

We must now satisfy the boundary conditions on the right edge
of the potential well, where $U(x)$ suffers only a finite discontinuity.
According to these conditions both the wave function and its first
derivative are continuous:

$$C_1 \sin\varkappa a = C_2 \sin\varkappa_1 a + C_3 \cos\varkappa_1 a \qquad (28.34)$$

$$\varkappa C_1 \cos\varkappa a = \varkappa_1 C_2 \cos\varkappa_1 a - \varkappa_1 C_3 \sin\varkappa_1 a \qquad (28.35)$$

From these equations we can determine $C_2$ and $C_3$ in terms of $C_1$,
thereby matching the solution of the wave equation outside the well
with the solution inside the well. Equations (28.34) and (28.35)
are linear with respect to $C_2$ and $C_3$ and have solutions for all values
of the coefficients:

$$C_2 = \frac{\varkappa_1 \sin\varkappa a\, \sin\varkappa_1 a + \varkappa \cos\varkappa a\, \cos\varkappa_1 a}{\varkappa_1}\, C_1$$

$$C_3 = \frac{\varkappa_1 \sin\varkappa a\, \cos\varkappa_1 a - \varkappa \cos\varkappa a\, \sin\varkappa_1 a}{\varkappa_1}\, C_1$$

The only exception is the value of $E$ at which $\varkappa_1 = 0$, that is $E = U_0$.
This point does not belong to the spectrum of the permitted energy
values. But at $E > U_0$ the Schrödinger equation always has a
solution. There is no discreteness in the eigenvalue spectrum.
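The expressions for $C_2$ and $C_3$ can be verified by substituting them back into the matching conditions (28.34) and (28.35). The sketch below does this for one arbitrary energy above $U_0$; the units $h = m = 1$ and the values of `a`, `U0`, and `E` are assumptions chosen only for illustration.

```python
import math

def outer_coefficients(k_in, k_out, a, c1=1.0):
    """C2 and C3 from the formulas that follow Eqs. (28.34) and (28.35)."""
    s, c = math.sin(k_in * a), math.cos(k_in * a)
    s1, co1 = math.sin(k_out * a), math.cos(k_out * a)
    c2 = c1 * (k_out * s * s1 + k_in * c * co1) / k_out
    c3 = c1 * (k_out * s * co1 - k_in * c * s1) / k_out
    return c2, c3

# Illustrative numbers in units with h = m = 1 (assumed).
a, U0, E = 1.0, 5.0, 8.0
k_in = math.sqrt(2.0 * E)            # wave number inside the well, Eq. (28.3)
k_out = math.sqrt(2.0 * (E - U0))    # wave number outside the well
c2, c3 = outer_coefficients(k_in, k_out, a)

# Value and slope of the wave function on both sides of x = a (C1 = 1).
inside_value = math.sin(k_in * a)
outside_value = c2 * math.sin(k_out * a) + c3 * math.cos(k_out * a)
inside_slope = k_in * math.cos(k_in * a)
outside_slope = k_out * (c2 * math.cos(k_out * a) - c3 * math.sin(k_out * a))

print(inside_value - outside_value, inside_slope - outside_slope)  # both near zero
```

Any other energy above `U0` can be substituted; the matching succeeds for all of them, which is the numerical counterpart of the continuous spectrum.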

We can choose the gauge of the potential energy in this problem
so that it is zero at $x = \infty$, that is, consider it equal to zero for
$x > a$ and equal to $-U_0$ for $0 \le x \le a$. Then the case which we
have just considered corresponds to positive eigenvalues of the total
energy.

(ii) Now let $E < U_0$. We introduce the quantity

$$\frac{2m}{h^2}(U_0 - E) = \kappa^2 \qquad (28.36)$$

The wave equation is now written differently than for $E > U_0$,
namely

$$\frac{d^2\psi}{dx^2} = \kappa^2\psi$$

and its solution is expressed in terms of the exponential function

$$\psi = C_4 e^{\kappa x} + C_5 e^{-\kappa x} \qquad (28.37)$$

But the exponential $e^{\kappa x}$ tends to infinity as $x$ increases. For
$x = \infty$ it would give an infinite probability for finding the particle,
and no finite value could be assigned to the integral $\int^{\infty} |\psi|^2\, dx$.
It follows that a physically meaningful solution exists only for
$C_4 = 0$ and must be of the form

$$\psi = C_5 e^{-\kappa x} \qquad (28.38)$$

Let us again try to satisfy the boundary conditions at $x = a$.
This time they appear as follows:

$$C_1 \sin\varkappa a = C_5 e^{-\kappa a} \qquad (28.39)$$

$$\varkappa C_1 \cos\varkappa a = -\kappa C_5 e^{-\kappa a} \qquad (28.40)$$

We divide equation (28.40) by (28.39) in order to eliminate $C_1$ and $C_5$,
and obtain

$$\varkappa \cot\varkappa a = -\kappa \qquad (28.41)$$

From this equation we find the expression for $\sin\varkappa a$:

$$\sin\varkappa a = \pm(1 + \cot^2\varkappa a)^{-1/2} = \pm\left[1 + \left(\frac{\kappa}{\varkappa}\right)^2\right]^{-1/2} = \pm\left[1 + \frac{U_0 - E}{E}\right]^{-1/2} = \pm\left(\frac{E}{U_0}\right)^{1/2} \qquad (28.42)$$

Let us reduce this equation to a more convenient form. From (28.3)

$$E^{1/2} = \frac{h}{a(2m)^{1/2}}\,\varkappa a$$

so that

$$\sin\varkappa a = \pm\frac{h}{a(2mU_0)^{1/2}}\,\varkappa a \qquad (28.43)$$

Only those solutions should be chosen for which $\cot\varkappa a$ is negative,
in accordance with (28.41). Hence, $\varkappa a$ must lie in the second, fourth,
sixth, eighth, and in general only even, quadrants.

We shall solve Eq. (28.43) graphically (Figure 32). The left-hand
side of the equation is represented by a sinusoid, while the right-hand
side is represented by two straight lines with the slopes
$\pm h[a(2mU_0)^{1/2}]^{-1}$. If the absolute value of the tangent of the
angle of inclination of these lines is less than $2/\pi$, they have one or
several common points with the sinusoid in the quadrants corresponding
to the roots of (28.41). The trivial point of intersection, $\varkappa a = 0$,
does not count because at $\varkappa = 0$ the wave function vanishes everywhere.
Thus, in a well of finite depth of the form considered there are
only several energy eigenvalues. But if

$$U_0 < \frac{\pi^2 h^2}{8ma^2} \equiv U_{cr} \qquad (28.44)$$

there are in general no points of intersection of the straight lines
with the sinusoid corresponding to energy eigenvalues (the intersection
point in the first quadrant does not count!). In Figure 32
the points of intersection in the even quadrants are marked by small
circles.
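The graphical construction can be replaced by a numerical one. The sketch below is one possible approach, not the method of the text: it looks for the roots of (28.43) in the even quadrants by bisection, using the variable $t$ for the product of the inner wave number and $a$, and $R = a(2mU_0)^{1/2}/h$ for the inverse slope of the straight lines. The units $h = m = a = 1$ and the sample depths are assumptions.

```python
import math

def critical_depth(a=1.0, m=1.0, h=1.0):
    """U_cr of Eq. (28.44)."""
    return (math.pi * h) ** 2 / (8.0 * m * a * a)

def bound_levels(U0, a=1.0, m=1.0, h=1.0, iterations=100):
    """Energies E < U0 that solve Eq. (28.43) in the even quadrants."""
    R = a * math.sqrt(2.0 * m * U0) / h
    levels = []
    j = 1
    while (2 * j - 1) * math.pi / 2.0 < R:
        # In the (2j)-th quadrant |sin t| falls from 1 to 0 while t/R grows,
        # so |sin t| - t/R changes sign exactly once between lo and hi.
        lo = (2 * j - 1) * math.pi / 2.0
        hi = min(j * math.pi, R)
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            if abs(math.sin(mid)) - mid / R > 0.0:
                lo = mid
            else:
                hi = mid
        t = 0.5 * (lo + hi)
        levels.append((h * t / a) ** 2 / (2.0 * m))   # E from Eq. (28.3)
        j += 1
    return levels

U_cr = critical_depth()
print(len(bound_levels(0.9 * U_cr)), len(bound_levels(1.1 * U_cr)))
print(bound_levels(20.0))
```

A well shallower than `U_cr` yields an empty list, and each further level appears when $R$ passes the next odd multiple of $\pi/2$.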

Of special interest is the case when the energy level lies very close
to the edge of the potential well in comparison with its total depth.
Suppose that this is the ground level and there are no other levels
in the well. Then, assuming the width of the well to be given and the
depth to be slightly greater than the critical value, $U_{cr}$, we write

$$\varkappa a = \frac{\pi}{2}(1 + y), \qquad U_0 = U_{cr}(1 + v),$$

where $v$ and $y$ are small quantities. Since close to a maximum value
the sine differs from unity by a second-order quantity, we obtain
the relationship between $v$ and $y$:

$$\sin\varkappa a \approx 1 = \frac{1 + y}{(1 + v)^{1/2}}, \qquad y = \frac{v}{2}$$

Now, making use of the definition of $\varkappa$, (28.3), we represent
Eq. (28.43) as

$$\sin\varkappa a = \sin\left[\frac{\pi}{2}(1 + y)\right] = \left(\frac{U_0 - \varepsilon}{U_0}\right)^{1/2}$$

Assuming $E$ to differ from $U_0$ by the small quantity $\varepsilon$, we find that

$$\frac{\varepsilon}{U_0} = \frac{\pi^2}{4}\, y^2$$

whence it is apparent that

$$\frac{\varepsilon}{U_0} = \frac{\pi^2}{16}\left(\frac{U_0}{U_{cr}} - 1\right)^2 \qquad (28.45)$$

Hence, the level corresponds to the condition $\varepsilon \ll U_0$: if the depth
of the well differs from the limiting depth at which the level just
appeared by a first-order quantity, the energy level lies at a distance
of the second order from the upper edge of the well.
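The second-order behaviour expressed by (28.45) can be checked against a numerical solution of (28.43). In the sketch below, which is an illustration under the stated substitutions rather than part of the text, $t$ stands for the product of the inner wave number and $a$, and $R = (\pi/2)(1 + v)^{1/2}$, so that (28.43) reads $\sin t = t/R$ and $\varepsilon/U_0 = 1 - (t/R)^2$.

```python
import math

def shallow_level_gap(v, iterations=200):
    """Return eps/U0 for a well of depth U0 = U_cr * (1 + v), with 0 < v < 3."""
    R = 0.5 * math.pi * math.sqrt(1.0 + v)
    lo, hi = 0.5 * math.pi, min(math.pi, R)
    for _ in range(iterations):          # bisection on sin t - t/R
        mid = 0.5 * (lo + hi)
        if math.sin(mid) - mid / R > 0.0:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    return 1.0 - (t / R) ** 2

def predicted_gap(v):
    """Right-hand side of Eq. (28.45) with U0/U_cr - 1 = v."""
    return math.pi ** 2 * v * v / 16.0

for v in (0.1, 0.01, 0.001):
    print(v, shallow_level_gap(v) / predicted_gap(v))
```

The printed ratio tends to unity as `v` decreases, the deviation being of the first order in `v`, as expected for a formula that keeps only the leading term.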

Such a case is actually obtained in a heavy hydrogen nucleus,
the deuteron. The depth of the potential well corresponding to the
nuclear forces is estimated as 20-30 MeV, while the proton-neutron
binding energy, $\varepsilon$, is about 2.2 MeV (we recall that the two-body
problem (the proton plus a neutron) reduces to a one-body problem
(see Sec. 3)). Thanks to the small binding energy of the particles in
a deuteron, the wave function of their relative motion outside the
well has, according to Eq. (28.38), as well as (28.44) and (28.45),
the following form:

$$\psi = C_5 \exp\left[-\frac{\pi^2}{8}\left(\frac{U_0}{U_{cr}} - 1\right)\frac{x}{a}\right] \qquad (28.46)$$

But it is apparent from this equation that the function falls
off rapidly at a distance of order

$$x \sim \frac{8a}{\pi^2}\left(\frac{U_0}{U_{cr}} - 1\right)^{-1}$$

which is many times greater than the dimensions of the potential
well (to be more precise, in the case of a deuteron it is not many
times, but two or three times). But then the integral of $|\psi|^2$ taken over
the domain outside the well, proportional to $(U_0/U_{cr} - 1)^{-1}$,
exceeds the integral over the domain within the well. In other words,
the probability of finding a particle outside the domain of nuclear
forces is much greater than of finding it within that domain. That is
why the wave function is known much better than the proton-neutron
interaction. But the properties of a deuteron are calculated
from its wave function, not directly from its nuclear forces. This
was used by H. Bethe and R. Peierls to develop a quite satisfactory
quantitative theory of the deuteron.

Finite and Infinite Motion. We shall now show that the shape of
the energy spectrum is related to the type of motion. For $E > U_0$
the solution outside the well is of the form (28.33). It remains finite
also for infinitely large $x$. Therefore, the integral $\int |\psi|^2\, dx$ taken
over the region of the well is infinitesimal compared with the same
integral taken over all the space. In other words, there is nothing
to prevent the particle from going to infinity. Such motion was
termed infinite in Section 5.

For $E < U_0$, solution (28.38), when it exists, falls off exponentially
as $x \to \infty$. Hence, the probability of the particle receding to
an infinite distance from the origin is zero: the particle all the time
remains at a finite distance from the well. It is natural to call such
motion finite, as in classical mechanics.

Thus, infinite motion has a continuous energy spectrum, while
finite motion has a discrete spectrum comprising separate values.
If the depth of the well is very small, there may be no finite motion.
The latter holds only in the three-dimensional case: in a one-dimensional
or two-dimensional well there is always at least one bound
level with negative total energy. That is why we emphasized that
the problem of a potential well of finite depth in effect refers to
three-dimensional motion. In classical mechanics, at $E < U_0$
motion is in every case finite in a well of any number of dimensions,
including a three-dimensional well.

In the course of the solution it becomes apparent that the obtained 
result refers not only to a rectangular potential well. Indeed, if the 
potential energy is gauged to zero at infinity, then the solution with 
positive total energy is of the form (28.33) for sufficiently large x, 
while the solution with negative total energy is of the form (28.38). 
The latter contains only one arbitrary constant, while (28.33) con¬ 
tains two constants. The integral curves of both solutions must be 
extended to the coordinate origin in order that the condition $\psi(0) = 0$
can be satisfied (we consider that $x$ is always greater than zero).
Obviously, if we have two constants at our disposal, we can always
choose them so that the condition $\psi(0) = 0$ is satisfied.⁶ But a solution
of the form (28.38) containing one constant becomes zero at the
origin only for certain special values of $\varkappa$.


⁶ If $\psi(0) = C_1\psi_1(0) + C_2\psi_2(0) = 0$, then $C_1/C_2 = -\psi_2(0)/\psi_1(0)$.

We can also offer another explanation of why infinite motion has
a continuous energy spectrum. The wave function of a particle in 
infinite motion differs from the wave function of a free particle 
only in the region of a potential well. But the probability of finding 
the particle in this region is infinitesimal if the whole region of 
motion is sufficiently large. Therefore, the wave function for infinite 
motion coincides with the wave function of a free particle in “almost” 
the entire space, for which the probability of finding the particle 
is equal to unity, and the energy spectrum turns out to be the same 
as for a free particle. 

If $U_0$ tends to infinity, the wave function outside the well falls
off very rapidly. In the limit ($U_0 \to \infty$) it vanishes at an infinitesimal
distance from the boundary $x = a$, yielding the boundary condition (28.1).

In the case of finite $U_0$ the wave function outside the well does not
become zero at once. Therefore, a nonzero probability exists that 
the particle will be outside the well at a finite distance from it. 

This would have been quite impossible in classical mechanics,
as is obtained from (28.38) in the limiting transition $h \to 0$. In
this case $\varkappa = \infty$, and $\psi$ becomes infinitesimal outside the well.
This, naturally, should be the case: if the particle is situated outside
the well, its kinetic energy (in the classical sense) is $E - U_0 < 0$.
But the velocity of such a particle is an imaginary quantity. In
classical mechanics this means that a given point of space is absolutely
unattainable for a particle with energy $E$.

In quantum mechanics, position and velocity never exist in the 
same states as precise quantities. Earlier we interpreted this in 
terms of the uncertainty relation, that is, we considered cases for 
which precision in the concept of velocity for a certain state was 
restricted by the limits $2\pi h/(m\,\Delta x)$. But this is a lower limit and
has to do with particles which are almost unaffected by forces.
The appearance of an imaginary velocity in the equation for a bound
particle shows that the very concept of velocity is not applicable
to a region of space, however large, for which $U > E$. We can express
this differently by saying that, for $U > E$, the uncertainty in the
kinetic energy is always greater than the difference $U - E$.

We have seen on the example of the deuteron that a bound particle 
may, even with overwhelming probability, occur in a domain where 
no forces are acting. 

Thus, in classical mechanics there is no analogue for the motion of 
a bound particle outside a well. And that is precisely the domain 
of motion that is decisive in finding the energy spectrum for finite 
motion. 

The Linear Harmonic Oscillator. We shall consider the problem 
of a quantum linear harmonic oscillator. We already know its 
Hamiltonian from Eq. (27.16). After replacing the momentum operator
by $(h/i)(d/dx)$, we obtain the Schrödinger equation

$$-\frac{h^2}{2m}\frac{d^2\psi}{dx^2} + \frac{m\omega^2x^2}{2}\,\psi = E\psi \tag{28.47}$$


Let us now introduce other units of measurement; in particular,
we shall take the unit of length equal to $[h/(m\omega)]^{1/2}$, so that

$$x = \left(\frac{h}{m\omega}\right)^{1/2}\xi \tag{28.48}$$

The quantity $\xi$ is dimensionless. The derivative $d\psi/dx$ is equal to

$$\frac{d\psi}{dx} = \left(\frac{m\omega}{h}\right)^{1/2}\frac{d\psi}{d\xi}$$

Further, we put

$$2E = \varepsilon h\omega$$

In terms of these dimensionless variables, Eq. (28.47) assumes
the form

$$-\frac{d^2\psi}{d\xi^2} + \xi^2\psi = \varepsilon\psi \tag{28.49}$$

Equation (28.49) does not contain any parameters of the problem,
that is, $\omega$, $m$, or $h$. For this reason the eigenvalue $\varepsilon$ can only be an
abstract number. Comparing this with the expression for energy,
we see that the energy eigenvalue of a harmonic oscillator is
proportional to its frequency $\omega$.

To solve Eq. (28.49), it is convenient to introduce a new dependent
variable $g(\xi)$ such that

$$\psi = g(\xi)\,e^{-\xi^2/2}$$

whence

$$\frac{d\psi}{d\xi} = e^{-\xi^2/2}\left(-\xi g + \frac{dg}{d\xi}\right)$$

$$\frac{d^2\psi}{d\xi^2} = e^{-\xi^2/2}\left(\xi^2 g - g - 2\xi\frac{dg}{d\xi} + \frac{d^2g}{d\xi^2}\right)$$

Substituting these relations into (28.49) and carrying out the necessary
cancellations, we obtain the equation for the new dependent
variable:

$$-\frac{d^2g}{d\xi^2} + 2\xi\frac{dg}{d\xi} = (\varepsilon - 1)\,g \tag{28.50}$$

The coefficients in this equation contain $\xi$ in not higher than the
first power and it is therefore comparatively simple to integrate it.





We seek the solution in the form of the power series

$$g(\xi) = g_0 + g_1\xi + g_2\xi^2 + g_3\xi^3 + \ldots = \sum_{n=0}^{\infty} g_n\xi^n$$

In order to determine the expansion coefficients $g_n$, we must
substitute the series into Eq. (28.50), differentiate it term by term and
compare the coefficients of the same powers $\xi^n$. The first derivative
is

$$\frac{dg}{d\xi} = g_1 + 2g_2\xi + 3g_3\xi^2 + \ldots = \sum_{n=1}^{\infty} n\,g_n\xi^{n-1}$$

so that

$$\xi\frac{dg}{d\xi} = \sum_{n=0}^{\infty} n\,g_n\xi^n$$

The second derivative is

$$\frac{d^2g}{d\xi^2} = 2g_2 + 3\cdot 2\,g_3\xi + \ldots = \sum_{k=2}^{\infty} k(k-1)\,g_k\xi^{k-2}$$

In the last summation we changed the summation index, denoting it
by the letter $k$. We shall now revert to $n$, assuming that $k - 2 = n$,
or $k = n + 2$. Then

$$\frac{d^2g}{d\xi^2} = \sum_{n=0}^{\infty} (n+2)(n+1)\,g_{n+2}\,\xi^n$$

Now substituting the expressions for the first and second derivatives
into Eq. (28.50) and collecting coefficients of $\xi^n$, we obtain

$$\sum_{n=0}^{\infty} \xi^n\left[-(n+2)(n+1)\,g_{n+2} + 2n\,g_n - (\varepsilon - 1)\,g_n\right] = 0$$

We know that for a power series to be equal to zero, the coefficients
of every power $\xi^n$ must vanish. Thus

$$g_{n+2} = g_n\,\frac{2n + 1 - \varepsilon}{(n+2)(n+1)} \tag{28.51}$$

In this way the expansion proceeds in powers of $\xi^2$, because the
recurrence links only every second coefficient $g_n$.

Let us assume initially that $g_0 \ne 0$. Then, from Eq. (28.51),
we find in turn $g_2, g_4, \ldots, g_{2k}$. Not a single odd coefficient appears
in the series if $g_1 = 0$: from the recurrence formula (28.51), all
$g_{2k+1}$ successively vanish. Conversely, if $g_0 = 0$, $g_1 \ne 0$, then only
the coefficients with odd indices remain in the series. It is therefore
sufficient to examine the solutions containing either only even or
only odd powers of $\xi$. To be specific, let us take the series of even
powers.

Let us examine the behaviour of the series for large values of $\xi$.
Then terms involving high powers of $\xi$, that is, large $n$, are predominant.
But if $n$ is a large number, then we can neglect the constant
numbers in comparison with $n$ in the terms appearing in the recurrence
formula (28.51), obtaining the asymptotic relation

$$g_{n+2} \approx \frac{2}{n}\,g_n$$

Since we have agreed to assume $n$ to be an even number, we put
$n = 2n'$. Instead of $g_{2n'}$ we write $g_{2n'} = g'_{n'}$, so that $g'_{n'}$ is the
coefficient of the series in powers of $\xi^2$. For these coefficients the
asymptotic recurrence relation is written as follows:

$$g'_{n'+1} \approx \frac{g'_{n'}}{n'}$$

In the function with odd powers of $\xi$ we can take $\xi$ outside the
parentheses and then repeat the derivation of the asymptotic relation
between the coefficients of the series in the parentheses. Obviously,
we will obtain the same relationship as for even $n$.

It is now easy to find the form of the coefficients themselves for
large $n'$:

$$g'_{n'+1} = \frac{g_0}{n'(n'-1)(n'-2)\cdots 1} = \frac{g_0}{n'!}$$

although actually we have in the denominator not exactly $n'!$, because
at $n'$ close to unity the very recurrence relation between the
coefficients is not correct. The way $g'_{n'}$ decreases when $n'$ increases
is correct for large $n'$. Using the obtained expression for $g'_{n'}$, we
find the asymptotic expansion for the function $g(\xi)$:

$$g(\xi) \approx \sum_{n'=0}^{\infty} g'_{n'}\,(\xi^2)^{n'} = g_0\sum_{n'=0}^{\infty}\frac{\xi^{2n'}}{n'!}$$

Thus, the asymptotic behaviour of $g(\xi)$ is described by the exponential
function $e^{\xi^2}$. But then it is also possible to find the wave
function for large $\xi$:

$$\psi = g(\xi)\,e^{-\xi^2/2} \approx g_0\,e^{\xi^2/2}$$

However, this form of $\psi$ is quite unacceptable: $\psi$ must remain
finite at large $\xi$ and not increase.

There is only one possibility of obtaining a finite value of $\psi(\xi)$
at infinity. For that the series for $g(\xi)$ must terminate at some $n$,
and all subsequent coefficients $g_{n+2}$, $g_{n+4}$, etc. should be identically
equal to zero. From Eq. (28.51) it is apparent that $g_{n+2}$ becomes zero
when

$$\varepsilon = 2n + 1$$

where $n$ is zero or any positive integer. Since $g_{n+4}$ is linearly expressed in
terms of $g_{n+2}$, it is sufficient for $g_{n+2}$ to vanish for the series to
terminate at $g_n$. It follows that at $\varepsilon = 2n + 1$ the function $g(\xi)$ becomes
a polynomial. The product of the polynomial and the exponential
$e^{-\xi^2/2}$ always tends to zero as $\xi \to \infty$. Hence $\psi(\infty) = 0$.
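The termination condition can be checked directly from the recurrence (28.51). The following is a minimal sketch using exact rational arithmetic; the function name `series` and its arguments are illustrative choices, not notation from the text. It lists successive nonzero-parity coefficients of $g(\xi)$ for a given $\varepsilon$, starting from a unit leading coefficient.

```python
from fractions import Fraction

def series(eps, parity, terms):
    """Coefficients g_parity, g_(parity+2), ... from recurrence (28.51).

    eps    -- dimensionless eigenvalue (an integer here, to stay exact)
    parity -- 0 for the even series (start at g_0), 1 for the odd one (g_1)
    terms  -- how many coefficients to return; the first is set to 1
    """
    coeffs = [Fraction(1)]
    n = parity
    for _ in range(terms - 1):
        # g_{n+2} = g_n * (2n + 1 - eps) / ((n + 2)(n + 1))
        coeffs.append(coeffs[-1] * Fraction(2 * n + 1 - eps, (n + 2) * (n + 1)))
        n += 2
    return coeffs

print(series(5, 0, 4))   # eps = 2*2 + 1: the even series stops after g_2
print(series(4, 0, 4))   # eps not of the form 2n + 1: no coefficient vanishes
```

For $\varepsilon = 5$ the even series reduces to $1 - 2\xi^2$, while for an $\varepsilon$ that is not an odd integer the coefficients never vanish and the series grows like $e^{\xi^2}$.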

Here, the wave function of an oscillator corresponds to its finite 
motion in the same sense as in classical mechanics: the probability 
that a particle will recede to infinity is zero. A discrete energy 
spectrum corresponds to finite motion. From the expression for $\varepsilon$
we find

$$E_n = h\omega\left(n + \tfrac{1}{2}\right) \tag{28.52}$$

The least possible energy is $E_0 = h\omega/2$. All this agrees with the
results obtained in the preceding section. 

The state with energy $E_0$, as mentioned before, is called the ground
state. The function $g(\xi)$ of this state terminates already at the zero
term, so that the corresponding wave function has the form

$$\psi_0 = g_0\,e^{-\xi^2/2} \tag{28.53}$$

This function does not have any zeros at a finite distance from the
coordinate origin, which must be the case for the ground state.

It may be noted that the state with zero energy would correspond
to a particle at rest at the origin. However, such a state is not compatible
with the uncertainty principle, since an oscillator in it
would have simultaneously a definite coordinate and a definite velocity.

Let us also find the eigenfunctions of the first and second excited
states. In the first state $E_1 = h\omega(1 + 1/2) = 3h\omega/2$. Here we must
put $g_0 = 0$ and $g_1 \ne 0$: with $\varepsilon = 3$ the series reduces to the single
term $g(\xi) = g_1\xi$, since all the other odd coefficients $g_{2n+1}$ vanish.
The even coefficients are absent altogether, because $g_0 = 0$. In general,
all functions with even $n$ turn out to be even, $\psi(-\xi) = \psi(\xi)$,
and with odd $n$ they are odd, $\psi(-\xi) = -\psi(\xi)$. As was just shown,
the function with $n = 1$ has the form

$$\psi_1 = g_1\,\xi\,e^{-\xi^2/2}$$

It becomes zero precisely at $\xi = 0$, that is, has one node.

In the same way it is easy to find the function $\psi_2$. Indeed,
$E_2 = h\omega(2 + 1/2) = 5h\omega/2$, $\varepsilon = 5$. From the exact recurrence formula,
the coefficient $g_2$ is

$$g_2 = g_0\,\frac{1 - \varepsilon}{1\cdot 2} = -2g_0$$

whence

$$\psi_2 = g_0\,(1 - 2\xi^2)\,e^{-\xi^2/2}$$

The nodes of this function are situated at the points $\xi = \pm 1/\sqrt{2}$.
In general, the function $\psi_n$ has $n$ nodes. The functions for the first
few $n$ are shown in Figure 33.
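As a quick numerical cross-check, the three functions found above can be substituted into Eq. (28.49) with $\varepsilon = 2n + 1$. The sketch below sets the arbitrary constants $g_0 = g_1 = 1$ and approximates the second derivative by a central difference; the helper names `psi` and `residual` and the step size are illustrative assumptions.

```python
import math

def psi(n, xi):
    """Unnormalized oscillator functions for n = 0, 1, 2 (g_0 = g_1 = 1)."""
    polynomial = {0: 1.0, 1: xi, 2: 1.0 - 2.0 * xi**2}[n]
    return polynomial * math.exp(-xi**2 / 2)

def residual(n, xi, step=1e-3):
    """Left side minus right side of (28.49) with eps = 2n + 1."""
    eps = 2 * n + 1
    # central-difference approximation of the second derivative
    second = (psi(n, xi + step) - 2 * psi(n, xi) + psi(n, xi - step)) / step**2
    return -second + xi**2 * psi(n, xi) - eps * psi(n, xi)

# largest residual on a grid -3 <= xi <= 3; it should be at the level of
# the finite-difference error, far below the size of the individual terms
max_residual = max(abs(residual(n, x / 10)) for n in range(3) for x in range(-30, 31))
print(max_residual)
print(psi(2, 1 / math.sqrt(2)))   # node of psi_2, zero up to rounding
```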



The energy eigenvalue distribution of a linear harmonic oscil¬ 
lator and its potential energy curve are shown in Figure 34. It is 
very interesting that energy eigenvalues are separated by equal 
intervals. 



The oscillator problem qualitatively resembles the problem of 
a rectangular well of infinite depth, but in a well the energy of a 
level increases in proportion to the square of the number of the 
respective level. 
More like the problem of determining the energy levels in an
infinitely deep potential well is the problem of determining the
eigenvalues of the angular momentum square. In both cases the
operator whose eigenvalues are being determined is expressed in
terms of the second derivatives, and the independent variables vary
within a restricted (finite) domain, $0 \le x \le a$ for the well, and
$0 \le \varphi \le 2\pi$ for the angular momentum square. Therefore,
in both cases the eigenvalues depend on the number quadratically:

$$E_n = \frac{\pi^2h^2n^2}{2ma^2}, \qquad M_l^2 = h^2\,l(l+1)$$


EXERCISES 

1. A potential-energy curve is given as follows: the potential energy
is zero for $x < 0$, and is equal to $U_0$ for $x > 0$ (the potential step). A beam
of particles is directed from the left along $x$, that is, the motion is
one-dimensional. Determine the reflection factor at $E > U_0$.

Solution. The wave function on the left from the step is

$$C_1e^{ipx/h} + C_2e^{-ipx/h}$$

On the right (above the step) the function must be sought in the form

$$C_3e^{ip'x/h}, \quad\text{where}\quad p' = [2m(E - U_0)]^{1/2}$$

since by definition in this domain there are no particles moving in the
direction of negative $x$. Note that the wave $e^{-i(Et - px)/h}$ travels from left to
right, and the wave $e^{-i(Et + px)/h}$ from right to left (see Sec. 18).

From the boundary conditions at $x = 0$ we find the ratio $|C_2/C_1|^2$,
that is, the ratio of the squares of the amplitudes of the reflected and incident
waves, which is in fact the reflection factor

$$\frac{|C_2|^2}{|C_1|^2} = \left(\frac{p - p'}{p + p'}\right)^2$$

For $E < U_0$ we see that $p'$ is a purely imaginary quantity, and the
reflection factor is always unity.

2. A potential-energy curve is given as follows: the potential energy
is zero for $x < 0$ and for $x > a$. For $0 \le x \le a$, it is equal to $U_0$ such that
$U_0 > 0$ (the potential barrier). A beam of particles whose energy is less
than $U_0$ impinges from the left. Determine the reflection factor.

Solution. The wave function to the left of the barrier is equal to
$e^{ikx} + Ce^{-ikx}$ ($k = p/h$). For simplicity, we put $C_1 = 1$, since we are interested
only in the ratio $C_2/C_1 = C$. Below the barrier, the wave function has the
form of a sum of exponentials of a real argument, $C_1'e^{\varkappa x} + C_2'e^{-\varkappa x}$, where
$\varkappa = [2m(U_0 - E)]^{1/2}/h$. Beyond the barrier we again seek the wave function
in the form of a wave travelling from left to right, $C_3e^{ikx}$ (there we have only
the transmitted wave, while before the barrier we have the incident and
reflected waves). The constants $C$, $C_1'$, $C_2'$, and $C_3$ are determined from the
continuity conditions for the wave function and its derivative at the
boundaries of the barrier. The expressions for the constants $C$ and $C_3$ have,
accordingly, the form

$$C = \frac{2(\varkappa^2 + k^2)\sinh\varkappa a}{(\varkappa + ik)^2e^{-\varkappa a} - (\varkappa - ik)^2e^{\varkappa a}}$$

$$C_3 = \frac{4ik\varkappa\,e^{-ika}}{(\varkappa + ik)^2e^{-\varkappa a} - (\varkappa - ik)^2e^{\varkappa a}}$$

and the expressions for the flux on both sides of the barrier are

$$j = \frac{hk}{m}\left(1 - |C|^2\right), \qquad j = \frac{hk}{m}\,|C_3|^2$$

Substituting $|C|$ and $|C_3|$, we find that both expressions for the flux
coincide, as could have been expected.

If $\varkappa a \gg 1$, that is, the barrier is almost opaque, we have

$$C_3 \approx -\frac{4ik\varkappa}{(\varkappa - ik)^2}\,e^{-ika}\,e^{-\varkappa a}$$

Thus the transmitted flux decreases exponentially with the thickness
of the barrier. Note also that if $E > U_0$, that is, the energy of the particles
lies above the barrier, some will nevertheless be reflected ($|C_3| < 1$). In
classical mechanics such a barrier does not reflect.

3. Show that reflection occurs from a potential well for which $U = 0$
at $-\infty < x < 0$, $U = -|U_0|$ at $0 < x < a$, and $U = 0$ at $a < x < \infty$.

4. Verify the orthogonality property of the wave functions for a well 
of infinite and finite depth. 

5. Show that the functions $g_n(\xi)$, in terms of which the wave functions
of a linear harmonic oscillator are expressed, can be expressed to the accuracy
of the constant factor in the form

$$g_n(\xi) = e^{\xi^2}\frac{d^n}{d\xi^n}\,e^{-\xi^2}$$

Verify this by substitution into Eq. (28.50) at $\varepsilon = 2n + 1$.

6. Normalize the functions $\psi_0$ and $\psi_1$ of the harmonic oscillator, taking
advantage of the fact that

$$\int_{-\infty}^{\infty} e^{-\xi^2}\,d\xi = \sqrt{\pi}, \qquad \int_{-\infty}^{\infty} \xi^2e^{-\xi^2}\,d\xi = \frac{\sqrt{\pi}}{2}$$

7. Verify that in the ground state of a linear harmonic oscillator
$\langle(\Delta p)^2\rangle\langle(\Delta x)^2\rangle$ assumes the minimum value $h^2/4$ (see (25.21)).
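The flux balance stated in the solution of Exercise 2 can be checked numerically. The sketch below simply evaluates the two amplitude formulas of that solution with complex arithmetic; the function name `barrier_amplitudes` and the sample values of $k$, $\varkappa$ and $a$ are arbitrary illustrative choices.

```python
import cmath
import math

def barrier_amplitudes(k, kappa, a):
    """Reflected (C) and transmitted (C3) amplitudes for a rectangular barrier.

    k     -- wave number outside the barrier
    kappa -- decay constant below the barrier
    a     -- barrier width
    """
    denominator = ((kappa + 1j * k) ** 2 * cmath.exp(-kappa * a)
                   - (kappa - 1j * k) ** 2 * cmath.exp(kappa * a))
    reflected = 2 * (kappa**2 + k**2) * math.sinh(kappa * a) / denominator
    transmitted = 4j * k * kappa * cmath.exp(-1j * k * a) / denominator
    return reflected, transmitted

C, C3 = barrier_amplitudes(k=1.0, kappa=2.0, a=0.7)
# equal fluxes on both sides mean 1 - |C|^2 = |C3|^2
print(abs(C) ** 2 + abs(C3) ** 2)
```

The printed sum equals unity up to rounding, which is the statement that both flux expressions coincide.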
29


MOTION IN A CENTRAL POTENTIAL 


The motion of an electron in a central attractive field is the principal
problem in the quantum mechanics of the atom. And it is not necessary
to regard the field as strictly Coulomb in character. For example,
in alkali-metal atoms, an outer electron which is bound relatively
weakly to the nucleus moves in the field of the nucleus and the
so-called atomic core, that is, all the other electrons. Such an
approximate description provides a satisfactory understanding of the
peculiarities of the behaviour of alkali-metal atoms and their energy
states without solving the extremely difficult many-body problem
of quantum mechanics. Even in those cases when the replacement
of an exact description by the approximate concept of the resultant
field of the rest of the electrons acting on the given electron is
unsatisfactory from the quantitative aspect, the general qualitative
picture of the atomic state is nevertheless conveyed correctly and
helps in the classification of separate states.

The approximate approach is successful because it correctly
represents the spatial distribution of the nodal surfaces of the wave
function. Therein lies the significance of the general problem on the
motion of an electron in a central field. Such a field is described by
a certain function (potential) $U(r)$ whose form need not be specified
in the most general case. However, in the immediate vicinity of the
force centre (the nucleus), where the screening effect of the other
electrons on the given one is least, $U(r)$ tends to infinity according
to the Coulomb law. Furthermore, we assume that $U(\infty) = 0$.


The Eigenfunctions of the Angular Momentum Square and Projection
(Spherical Functions). Referring to (24.35), we write the Hamiltonian
of a particle moving in a central force field as follows:

$$\hat{\mathcal{H}} = -\frac{h^2}{2m}\,\frac{1}{r^2}\frac{\partial}{\partial r}\,r^2\frac{\partial}{\partial r} + \frac{\hat{M}^2}{2mr^2} + U(r) \tag{29.1}$$

Since $\hat{M}^2$ involves differentiations only with respect to angles,
it commutes with the energy operator $\hat{\mathcal{H}}$. Hence, in one and the
same state we have eigenvalues for the energy, the angular momentum
square $M^2$, and the angular momentum projection $M_z$, which
commutes with both the Hamiltonian and the angular momentum
square.

However, $\hat{M}_z$ is not involved in the Hamiltonian, hence $M_z$ may
have different eigenvalues $hk$ for one and the same energy $E$ of the
system. To the same energy $E$ there corresponds a whole set of
eigenfunctions $\psi_{Elk}$, where $-l \le k \le l$. States described by such wave
functions are termed degenerate.

It is not difficult to establish the criterion of degeneracy, that is,
the conditions at which it appears. It was shown in Section 24 that
different angular momentum components, for example $\hat{M}_x$ and $\hat{M}_z$,
do not commute, but they commute with the angular momentum
square and, as can be seen from (29.1), with the Hamiltonian of
motion in a central field. Let us operate with $\hat{M}_x$ on the function $\psi_{Elk}$.
Since it is not an eigenfunction of the operator, we get an expansion
involving all the eigenfunctions of $\hat{M}_z$:

$$\hat{M}_x\psi_{Elk} = \sum_{k'=-l}^{l} c_{kk'}\,\psi_{Elk'} \tag{29.2}$$

(As was shown in Section 25, the square of the expansion coefficient
modulus, $|c_{kk'}|^2$, is equal to the probability that, in passing
through a magnetic field directed along the $x$ axis, a particle with
the given angular momentum projection $M_z = hk$ will be found
in a beam where $M_x = hk'$.) The wave function (29.2), which is
equal to $\hat{M}_x\psi_{Elk}$, is at the same time an eigenfunction of the
Hamiltonian (29.1) because $\hat{M}_x$ commutes with the Hamiltonian;
each term $\psi_{Elk'}$ is an eigenfunction of $\hat{\mathcal{H}}$, but the sum as a whole
is not an eigenfunction of $\hat{M}_z$. Thus,
because the operators $\hat{M}_x$ and $\hat{M}_z$ do not commute, different wave
functions, $\psi_{Elk}$ and $\hat{M}_x\psi_{Elk}$, correspond to the same energy, that is,
degeneracy occurs.

It is obvious that this reasoning is applicable to the eigenfunctions
of any operator $\hat\lambda$, provided there are two other operators, $\hat\mu$ and $\hat\nu$,
that commute with $\hat\lambda$ and do not commute between themselves:
$\hat\mu\psi_{\lambda\nu}$ does not coincide with $\psi_{\lambda\nu}$, but at the same time is an
eigenfunction of $\hat\lambda$, since $\hat\lambda\hat\mu\psi = \hat\mu\hat\lambda\psi = \lambda(\hat\mu\psi)$.
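The commutation argument can be made concrete for $l = 1$. The sketch below assumes the standard $3\times 3$ matrices of the angular momentum components in the basis $k = 1, 0, -1$, in units where $h = 1$; these matrices and all helper names are brought in for illustration and are not taken from the text.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def is_zero(a, tol=1e-12):
    return all(abs(a[i][j]) < tol for i in range(3) for j in range(3))

s = 1 / math.sqrt(2)
Mz = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]
Mx = [[0, s, 0], [s, 0, s], [0, s, 0]]
My = [[0, -1j * s, 0], [1j * s, 0, -1j * s], [0, 1j * s, 0]]

# angular momentum square: Mx^2 + My^2 + Mz^2, expected to be l(l+1) = 2 times unity
squares = [matmul(m, m) for m in (Mx, My, Mz)]
M2 = [[sum(sq[i][j] for sq in squares) for j in range(3)] for i in range(3)]

print(is_zero(commutator(M2, Mx)))   # the square commutes with Mx
print(is_zero(commutator(M2, Mz)))   # ... and with Mz
print(is_zero(commutator(Mx, Mz)))   # but Mx and Mz do not commute

# Mx acting on the state k = 1 produces a different state with the same M^2
state_k1 = [1, 0, 0]
print([sum(Mx[i][j] * state_k1[j] for j in range(3)) for i in range(3)])
```

Because $\hat M_x$ maps the $k = 1$ state onto the $k = 0$ state while leaving the eigenvalue of $\hat M^2$ (and of any Hamiltonian built from it) unchanged, the two states necessarily share one energy.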

In Section 24 we developed the eigenfunctions of the operators $\hat{M}^2$
and $\hat{M}_z$ and represented them in the form (24.39). Here we shall
express the same functions in a somewhat different manner more
convenient for direct computations.

Restricting ourselves to the case of $k = 0$, we find that the wave
function (24.39) can, with the addition of a numerical factor, be
represented as

$$Y_l^0 = \frac{(-1)^l}{l!}\,r^{l+1}\left(\frac{\partial}{\partial z}\right)^{l}\frac{1}{r} \tag{29.3}$$

The factor was introduced for the definition of $Y_l^0$ to coincide with
standard notation. It can be seen from the form of (29.3) that this
formula is an expression of a homogeneous coordinate function of
zero dimensions. Hence, it is a function only of angles, as it should
be. Furthermore, since the wave function (29.3) involves only
differentiation with respect to $z$, it contains only the angle between $z$
and $r$, or more precisely, its cosine ($\cos\vartheta$). Thus, the function
defined by (29.3) is a polynomial of $\cos\vartheta$.

This polynomial satisfies the following differential equation:

$$\frac{1}{\sin\vartheta}\frac{d}{d\vartheta}\sin\vartheta\,\frac{dY_l^0}{d\vartheta} = \frac{d}{d\cos\vartheta}\,(1 - \cos^2\vartheta)\,\frac{dY_l^0}{d\cos\vartheta} = -l(l+1)\,Y_l^0 \tag{29.4}$$

which is obtained from (24.35) if we put $p_\varphi = M_z = hk = 0$, since
the eigenfunction of the angular momentum square, $Y_l^0(\cos\vartheta)$,
corresponds to the zero value of the angular momentum projection
on the $z$ axis.

Equation (29.4) is a second-order equation, hence it has two linearly
independent solutions. However, only one of them, namely
$Y_l^0(\cos\vartheta)$, is regular, that is, has no singular points. It is not hard
to verify that in the general case (for arbitrary $l$) the second solution
has singular points at $\cos\vartheta = \pm 1$, but we shall restrict ourselves
to the proof for $l = 0$. The regular solution is $Y_0^0 = 1$, while
the second solution, as can be seen from (29.4), is determined as
follows:

$$\frac{d}{d\cos\vartheta}\,(1 - \cos^2\vartheta)\,\frac{dY_0^0}{d\cos\vartheta} = 0, \qquad (1 - \cos^2\vartheta)\,\frac{dY_0^0}{d\cos\vartheta} = c$$

$$Y_0^0 = c\int\frac{d\cos\vartheta}{1 - \cos^2\vartheta}$$

It becomes infinite at $\vartheta = 0$ and $\vartheta = \pi$.

Consequently, if there exists a regular solution for (29.4), it
coincides with $Y_l^0$ up to a constant factor.

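We define the function $P_l(u)$ as

$$P_l(u) = \frac{1}{2^l\,l!}\,\frac{d^l}{du^l}\,(u^2 - 1)^l \tag{29.5}$$

Let us verify whether it satisfies equation (29.4). We take the
expression

$$y \equiv (u^2 - 1)^l \tag{29.6}$$

which satisfies

$$(1 - u^2)\,\frac{dy}{du} + 2lu\,y = 0 \tag{29.7}$$

Differentiating this relation $l + 1$ times with respect to $u$, we get

$$
\begin{aligned}
&(1 - u^2)\left(\frac{d}{du}\right)^{l+2}y - 2(l+1)\,u\left(\frac{d}{du}\right)^{l+1}y - l(l+1)\left(\frac{d}{du}\right)^{l}y \\
&\qquad + 2lu\left(\frac{d}{du}\right)^{l+1}y + 2l(l+1)\left(\frac{d}{du}\right)^{l}y \\
&= (1 - u^2)\,\frac{d^2}{du^2}\left(\frac{d}{du}\right)^{l}y - 2u\,\frac{d}{du}\left(\frac{d}{du}\right)^{l}y + l(l+1)\left(\frac{d}{du}\right)^{l}y \\
&= \frac{d}{du}\,(1 - u^2)\,\frac{d}{du}\left(\frac{d}{du}\right)^{l}y + l(l+1)\left(\frac{d}{du}\right)^{l}y = 0
\end{aligned}
\tag{29.8}
$$

Thus, if we put $u = \cos\vartheta$, the expression $(d/du)^l y$ will satisfy
the same equation (29.4) as $Y_l^0(\cos\vartheta)$. The numerical factors in
(29.3) and in the definition (29.5) have been so chosen as to
obtain the exact equality $Y_l^0 = P_l(\cos\vartheta)$ (see Exercise 1). The
polynomials (29.5) are known as the Legendre polynomials.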
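Definition (29.5) and equation (29.8) are easy to verify with exact polynomial arithmetic. In the sketch below a polynomial in $u$ is stored as a list of rational coefficients in ascending powers; this representation and the helper names are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

def poly_mul(p, q):
    """Product of two polynomials given as ascending coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Derivative of a polynomial; a constant differentiates to [0]."""
    return [i * c for i, c in enumerate(p)][1:] or [Fraction(0)]

def legendre(l):
    """P_l(u) from (29.5): (1 / (2^l l!)) d^l/du^l (u^2 - 1)^l."""
    y = [Fraction(1)]
    for _ in range(l):
        y = poly_mul(y, [Fraction(-1), Fraction(0), Fraction(1)])  # times (u^2 - 1)
    for _ in range(l):
        y = poly_diff(y)
    norm = Fraction(1, 2**l * factorial(l))
    return [norm * c for c in y]

def legendre_residual(l):
    """Coefficients of d/du[(1 - u^2) dP_l/du] + l(l+1) P_l, expected to vanish."""
    p = legendre(l)
    inner = poly_mul([Fraction(1), Fraction(0), Fraction(-1)], poly_diff(p))
    lhs = poly_diff(inner)
    size = max(len(lhs), len(p))
    return [(lhs[i] if i < len(lhs) else 0) + l * (l + 1) * (p[i] if i < len(p) else 0)
            for i in range(size)]

print(legendre(2))            # coefficients of (3u^2 - 1)/2
print(legendre_residual(4))   # all zeros
```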

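To round out the picture, we present without proof the expression
for spherical functions for $k \ne 0$:

$$\left[(-1)^k\,Y_l^{-k}(u, \varphi)\right]^* = Y_l^k(u, \varphi) = \frac{(1 - u^2)^{k/2}}{2^l\,l!}\left(\frac{d}{du}\right)^{l+k}(u^2 - 1)^l\,e^{ik\varphi} \tag{29.9}$$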

Let us now find the normalization constant for the Legendre polynomials
$P_l$ from the condition

$$1 = a_l^2\int_{-1}^{1} P_l^2(u)\,du$$

Substituting the expression (29.5) and integrating $l$ times by parts,
we obtain

$$1 = a_l^2\,\frac{(-1)^l}{(2^l\,l!)^2}\int_{-1}^{1}(u^2 - 1)^l\left(\frac{d}{du}\right)^{2l}(u^2 - 1)^l\,du$$

All the integrated terms vanish at the limits, because the order
of the derivative in one of the factors in them is less than $l$, and
after differentiation there remains a certain power of $(u^2 - 1)$ not
equal to zero. The $(2l)$th derivative of $(u^2 - 1)^l$ is equal to $(2l)!$.
Thus we arrive at the normalization condition:

$$a_l^2\,\frac{(2l)!}{(2^l\,l!)^2}\int_{-1}^{1}(1 - u^2)^l\,du = 1$$

The integral here is an Euler integral of the first kind (a beta function).
Substitution of $(1 + u)/2 = v$ reduces it to standard form:

$$\int_{-1}^{1}(1 - u^2)^l\,du = 2^{2l+1}\int_0^1 v^l(1 - v)^l\,dv$$

and is equal to

$$\frac{2^{2l+1}\,(l!)^2}{(2l + 1)!}$$

We thus find the normalization coefficient $a_l$:

$$a_l = \left(\frac{2l + 1}{2}\right)^{1/2} \tag{29.10}$$
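The closed form of the integral and the resulting coefficient (29.10) can be confirmed exactly for small $l$. The sketch below expands $(1 - u^2)^l$ by the binomial theorem and integrates term by term with rational numbers; the function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

def integral_one_minus_u2(l):
    """Exact integral of (1 - u^2)^l over -1 <= u <= 1."""
    # (1 - u^2)^l = sum_j C(l, j) (-1)^j u^(2j); each power integrates to 2/(2j + 1)
    return sum(Fraction((-1) ** j * comb(l, j) * 2, 2 * j + 1) for j in range(l + 1))

def closed_form(l):
    """The value 2^(2l+1) (l!)^2 / (2l+1)! quoted in the text."""
    return Fraction(2 ** (2 * l + 1) * factorial(l) ** 2, factorial(2 * l + 1))

def a_l_squared(l):
    """Normalization coefficient squared, from the normalization condition."""
    return Fraction(2**l * factorial(l)) ** 2 / (factorial(2 * l) * integral_one_minus_u2(l))

for l in range(5):
    print(l, integral_one_minus_u2(l), closed_form(l), a_l_squared(l))
```

Each row shows the two values of the integral agreeing and $a_l^2 = (2l + 1)/2$.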

For $k \ne 0$ the normalization coefficient is

(29.11)

Radial Functions. The equations for radial functions are investigated
as follows. We begin by substituting the wave function in the
form

$$\psi = \chi(r)\,Y_l^k(\cos\vartheta, \varphi) \tag{29.12}$$

into the Schrödinger equation $\hat{\mathcal{H}}\psi = E\psi$, where $\hat{\mathcal{H}}$ is given by (29.1).
The operator $\hat{M}^2$ involved in $\hat{\mathcal{H}}$, in operating on $Y_l^k$, yields a constant
number $h^2l(l+1)$. The partial derivative with respect to $r$ applied
to $\chi(r)$ should be replaced by the total derivative. As a result we
obtain the equation for the radial function:

$$-\frac{h^2}{2m}\,\frac{1}{r^2}\frac{d}{dr}\,r^2\frac{d\chi}{dr} + \frac{h^2l(l+1)}{2mr^2}\,\chi + U(r)\,\chi = E\chi \tag{29.13}$$

It is convenient to reduce this equation to one-dimensional form
by means of the substitution

$$\chi = \frac{R}{r} \tag{29.14}$$

Without repeating the computations we used to transform Eq. (20.5),
we write the final equation for finding the energy eigenvalues:

$$-\frac{h^2}{2m}\frac{d^2R}{dr^2} + \frac{h^2l(l+1)}{2mr^2}\,R + U(r)\,R = ER \tag{29.15}$$


As long as the form of U (r) has not yet been made definite, we 
can consider (29.15) only in two limiting cases: for very large and 
for very small distances from the nucleus. 

The field of the atomic core is not effective at very small distances
from the nucleus, and there remains only the Coulomb law
$U(r) = -Ze^2/r$ ($Z$ is the atomic number of the element). However, if
$r$ is very small, the term $[h^2l(l+1)/(2mr^2)]R$ is, in any case, larger
than the term $-(Ze^2/r)R$, and all the more so than $ER$. For the
time being we put $l \ne 0$. Hence, in direct proximity to the nucleus
the wave equation is of a very simple form:

$$\frac{d^2R}{dr^2} = \frac{l(l+1)}{r^2}\,R \tag{29.16}$$

In this form it is solved by the substitution

$$R = r^\alpha \tag{29.17}$$

so that

$$\alpha(\alpha - 1) = l(l+1) \tag{29.18}$$

This equation has two roots,

$$\alpha = l + 1 \quad\text{and}\quad \alpha = -l \tag{29.19}$$

But from (29.14), the second root yields $\chi = r^{-l-1}$; at the point
$r = 0$ such a function $\chi$ becomes infinite for all $l$. Therefore, we
must discard the root $\alpha = -l$ and take the dependence of $R$ on $r$
for small $r$'s in the form

$$R = Cr^{l+1}, \qquad \chi = Cr^l \tag{29.20}$$

For $l = 0$ the Coulomb term must be retained in the equation.
This equation is solved by substituting $R$ in the form of a series

$$R = a_1r + a_2r^2 + \ldots \tag{29.21}$$

so that the dependence of the wave function upon the radius at
small distances from the nucleus has the form (29.20) for all $l$'s.

The greater the angular momentum, the higher the order of the
wave function's zero at the coordinate origin. Only for $l = 0$ does
it remain finite close to the nucleus: $\chi = R/r = a_1$. This can be
understood by analogy with classical mechanics: angular momentum
is the product of momentum by the "arm", i.e., by the distance from
the origin; $l = 0$ corresponds to a zero "arm" and a zero angular
momentum. Therefore, there is a nonzero probability of finding the
electron at the origin. In the old version of quantum mechanics (due
to Bohr) the electron orbit with zero angular momentum passed
through the nucleus, which could not be explained. The larger angular
momentum values correspond to larger "arms" and, correspondingly,
in quantum mechanics, to a smaller probability of finding an electron
close to the nucleus.

The behaviour of the wave function close to the origin can also be
explained as follows. A repulsive centrifugal force acts on the particle;
to this force there corresponds an effective potential $h^2l(l+1)/(2mr^2)$
(see Sec. 5). This limits the classically possible region of motion for
small $r$'s. In quantum mechanics the particle penetrates the region
where according to classical mechanics the velocity would have to be
imaginary. But the wave function decreases rapidly with penetration
into this region, that is, as it approaches the origin of the
coordinate system. The decrease in this case is according to the
power law: as $r \to 0$, the wave function decreases as $r^l$, that is,
the damping law is the stronger the higher the barrier. At $l = 0$
there is no barrier, and nothing prevents the electron from drawing
infinitely close to the nucleus.
Let us now examine the region of large values of $r$. For large $r$'s,
in Eq. (29.15) we can discard the terms that decrease as $r$ increases,
that is, the centrifugal and Coulomb energy (the latter is gauged
to zero at infinity: $U(\infty) = 0$). Then the equation is greatly
simplified:

$$\frac{d^2R}{dr^2} = -\frac{2mE}{h^2}\,R \tag{29.22}$$

Its general solution appears thus:

$$R = C_1\exp\left(\frac{(-2mE)^{1/2}}{h}\,r\right) + C_2\exp\left(-\frac{(-2mE)^{1/2}}{h}\,r\right) \tag{29.23}$$

(i) Let the energy be positive, $E > 0$. Here, $R$ appears as follows:

$$R = C_1\exp\left(\frac{i(2mE)^{1/2}}{h}\,r\right) + C_2\exp\left(-\frac{i(2mE)^{1/2}}{h}\,r\right) \tag{29.24}$$


Both terms remain finite for any value of $r$. Therefore, the two
constants $C_1$ and $C_2$ must be retained in the solution. We came
across the same situation in Section 28 in considering the solution
of the wave equation (28.33) for a potential well of finite depth.

Any general solution of a second-order differential equation
involves two arbitrary constants. Let us suppose that the solution
(29.20), which holds for small $r$'s only, is continued into the
region of large $r$'s, where it is not of the simple form $r^{l+1}$ but
nevertheless satisfies the precise equation (29.15). A certain integral
curve is thus obtained for this equation. But any integral curve can
be represented by properly choosing the two constants in the general
solution. As $r$ tends to infinity this solution acquires its asymptotic
form (29.24), if $E > 0$. The expression (29.24) remains finite when
$r \to \infty$ for any constants $C_1$ and $C_2$. It follows that, for a positive
energy, the wave equation always has a finite solution for any values
of $r$. Therefore, the energy region $E > 0$ corresponds to a continuous
spectrum, since the wave function satisfies the required conditions
at zero and at infinity for any $E > 0$. In accordance with (29.24),
the probability of finding an electron for $r \to \infty$ does not become
zero; that is, this case corresponds to infinite motion, as in the
classical problem considered in Section 5 (see also Sec. 28).

Thus, the general rule that infinite motion possesses a continuous 
energy spectrum has been confirmed. 

(ii) Now let $E < 0$, or $E = -|E|$. Then (29.23) must be represented
thus:

$$R = C_1\exp\left(\frac{(2m|E|)^{1/2}}{h}\,r\right) + C_2\exp\left(-\frac{(2m|E|)^{1/2}}{h}\,r\right) \tag{29.25}$$

Here the first term tends to infinity with $r$, and we must therefore
put $C_1 = 0$, so that $R$ involves one arbitrary constant instead of two:

$$R = C_2\exp\left(-\frac{(2m|E|)^{1/2}}{h}\,r\right) \tag{29.26}$$

If we now draw an integral curve from the coordinate origin,
starting from $r = 0$, where it has the form (29.20), for large $r$, as
a rule, it will not be reduced to the form (29.26). For all negative
energy values, except certain ones, the integral curve is represented
in the form (29.25) at infinity and, hence, does not satisfy the boundary
condition imposed on the wave function. Only for those energy
values for which the plot of the integral curve is such that

$$C_1(E) = 0 \tag{29.27}$$

does the wave equation have a solution satisfying the boundary
conditions. This corresponds to a discrete energy spectrum. At the
same time, $R(\infty)$ becomes zero, so that the finite motion has a
discrete energy spectrum, as expected.
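Condition (29.27) is exactly what a numerical "shooting" procedure exploits. The sketch below is an illustration under several assumptions that go beyond the text: a pure Coulomb potential $U = -Z/r$, units in which $e = m = h = 1$, a fourth-order Runge-Kutta integrator, and arbitrary choices of step, outer radius and starting bracket. It integrates Eq. (29.15) outward from the small-$r$ form (29.20) and bisects on the sign of $R$ at large $r$, which follows the sign of $C_1(E)$.

```python
def shoot(eps, l=0, Z=1, step=0.002, r_max=15.0):
    """Integrate R'' = [l(l+1)/r^2 - 2Z/r + 2*eps] R outward; return R(r_max).

    eps is |E| in units e = m = h = 1 (an assumption of this sketch).
    """
    def f(r):
        return l * (l + 1) / r**2 - 2.0 * Z / r + 2.0 * eps

    # start just off the origin with the series R = r^(l+1) (1 - Z r/(l+1))
    r = step
    R = r ** (l + 1) * (1.0 - Z * r / (l + 1))
    dR = (l + 1) * r**l - Z * (l + 2) / (l + 1) * r ** (l + 1)
    for _ in range(int(round((r_max - step) / step))):
        # classical Runge-Kutta step for the pair (R, dR/dr)
        k1R, k1D = dR, f(r) * R
        k2R, k2D = dR + 0.5 * step * k1D, f(r + 0.5 * step) * (R + 0.5 * step * k1R)
        k3R, k3D = dR + 0.5 * step * k2D, f(r + 0.5 * step) * (R + 0.5 * step * k2R)
        k4R, k4D = dR + step * k3D, f(r + step) * (R + step * k3R)
        R += step * (k1R + 2 * k2R + 2 * k3R + k4R) / 6.0
        dR += step * (k1D + 2 * k2D + 2 * k3D + k4D) / 6.0
        r += step
    return R

def find_level(lo, hi, iterations=30, **kwargs):
    """Bisect on the sign of R(r_max), i.e. on the sign of C_1(E)."""
    f_lo = shoot(lo, **kwargs)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = shoot(mid, **kwargs)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

eps_ground = find_level(0.3, 0.8)
print(eps_ground)   # close to 0.5 for Z = 1, l = 0
```

For every trial energy except the eigenvalue the curve runs off to plus or minus infinity; the bisection homes in on the value where the growing exponential is absent, which for $Z = 1$, $l = 0$ is the familiar ground level $|E| = 1/2$ in these units.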


The Coulomb Field. Let us now see how a discrete energy spectrum 
is found in the case of a purely Coulomb field: 

$$U(r) = -\frac{Ze^2}{r} \tag{29.28}$$

This occurs in a hydrogen atom (though not in a molecule!), in singly 
ionized helium, doubly ionized lithium, etc. 

The wave equation (29.15) is now written as 


$$-\frac{h^2}{2m}\frac{d^2R}{dr^2} + \frac{h^2\, l(l+1)}{2mr^2}\, R - \frac{Ze^2}{r}\, R = -|E|\, R \tag{29.29}$$


where we have straightaway taken the case of negative energies, 
which leads to a discrete spectrum. 

It is convenient here to change the units of length and energy, 
in much the same way as was done in the problem of the harmonic 
oscillator (Sec. 28). In place of the system where the basic units are 
the arbitrary quantities centimetre, gram, and second, we take the 
following units: the elementary charge e, the mass of the electron m, 
and the quantum of action h. From these quantities we form the unit 
of length: 

$$\frac{h^2}{me^2} = (5.29172 \pm 0.00002) \times 10^{-9}\ \text{cm}$$


and the unit of energy: 

$$\frac{me^4}{h^2} = (27.20976 \pm 0.00044)\ \text{eV}$$

(the units of length and energy are expressed in terms of the actual 
mass of the electron; in a hydrogen atom we must take the reduced 
electron-proton mass). 





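Hence, if we put e = 1, m = 1, h = 1 in Eq. (29.29), then length 
and energy are measured in these units. Let us denote the length so 
measured by ξ: 

$$\xi = \frac{me^2}{h^2}\, r \tag{29.30}$$

and the energy by ε: 

$$\varepsilon = \frac{h^2}{me^4}\, |E| \tag{29.31}$$

so that, of the constants, the wave equation will involve only the 
atomic number Z: 

$$-\frac{d^2R}{d\xi^2} + \frac{l(l+1)}{\xi^2}\, R - \frac{2Z}{\xi}\, R = -2\varepsilon R \tag{29.32}$$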
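
As a numerical check of the units of length and energy formed from e, m, and h, the following Python sketch evaluates h^2/(m e^2) and m e^4/h^2 in CGS units. The constant values are assumptions: they are modern tabulated values, not the ones used in the text, and h is taken to be the quantum of action divided by 2π. All names are illustrative.

```python
# Atomic units built from e, m and h, evaluated in CGS-Gaussian units.
# The numerical constants are modern tabulated values (an assumption of
# this sketch); h is the quantum of action divided by 2*pi.
H_BAR = 1.054571817e-27        # erg * s
M_E = 9.1093837015e-28         # g, electron mass
E_CHARGE = 4.803204713e-10     # esu, elementary charge
ERG_PER_EV = 1.602176634e-12   # erg in one electron-volt


def atomic_unit_of_length():
    """Return h^2 / (m e^2) in centimetres."""
    return H_BAR**2 / (M_E * E_CHARGE**2)


def atomic_unit_of_energy_ev():
    """Return m e^4 / h^2 in electron-volts."""
    return M_E * E_CHARGE**4 / H_BAR**2 / ERG_PER_EV


if __name__ == "__main__":
    print(f"unit of length: {atomic_unit_of_length():.5e} cm")
    print(f"unit of energy: {atomic_unit_of_energy_ev():.4f} eV")
```

The small differences from the figures quoted in the text come from the newer constants assumed here.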

We look for the solution of this equation in the form of a series 
expansion. We shall proceed here from the solutions obtained for 
large and small values of ξ (or r). 

In accordance with Eqs. (29.20) and (29.23), we write R in the 
following form: 

$$R = \xi^{l+1} e^{-\xi(2\varepsilon)^{1/2}} (\chi_0 + \chi_1 \xi + \chi_2 \xi^2 + \dots) = \xi^{l+1} e^{-\xi(2\varepsilon)^{1/2}} \sum_{n=0}^{\infty} \chi_n \xi^n = e^{-\xi(2\varepsilon)^{1/2}} \sum_{n=0}^{\infty} \chi_n \xi^{n+l+1} \tag{29.33}$$

The first factor determines the form of R as ξ → 0, the second 
factor should basically correspond to the form of R for large ξ, 
and the series interpolates, as it were, between the limiting values. 
Differentiating (29.33) twice, we obtain 

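$$\frac{d^2R}{d\xi^2} = 2\varepsilon\, e^{-\xi(2\varepsilon)^{1/2}} \sum_{n=0}^{\infty} \chi_n \xi^{n+l+1} - 2(2\varepsilon)^{1/2} e^{-\xi(2\varepsilon)^{1/2}} \sum_{n=0}^{\infty} (n+l+1)\, \chi_n \xi^{n+l} + e^{-\xi(2\varepsilon)^{1/2}} \sum_{n=0}^{\infty} (n+l+1)(n+l)\, \chi_n \xi^{n+l-1}$$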

The first term on the right is simply 2εR. Since the second 
derivative enters (29.32) with a minus sign, this term cancels with 
−2εR on the right-hand side of (29.32). We group the remaining 
terms so that in one of them the power of ξ is everywhere less by 
unity than in (29.33) and, in the other, less by two units. In addition 
we eliminate the exponential factor. We shall now have an equality 
between two such series: 

$$\sum_{n=0}^{\infty} [l(l+1) - (n+l+1)(n+l)]\, \chi_n \xi^{n+l-1} = \sum_{n=0}^{\infty} [2Z - 2(2\varepsilon)^{1/2}(n+l+1)]\, \chi_n \xi^{n+l}$$

Such an equality is possible only when the coefficients of the same 
powers of ξ coincide. On the left-hand side the power ξ^{n+l} has 
a coefficient involving χ_{n+1}. Hence 


$$\chi_{n+1} = \chi_n\, \frac{2[Z - (n+l+1)(2\varepsilon)^{1/2}]}{l(l+1) - (n+l+1)(n+l+2)} \tag{29.34}$$


From the relationship (29.34), all the χ_n are determined 
consecutively. When n is large, we may neglect the constant numbers 
Z, l and l(l + 1) in Eq. (29.34) in comparison with n; there then 
remains the limit 

$$\chi_{n+1} = \chi_n\, \frac{2(2\varepsilon)^{1/2}}{n} \tag{29.35}$$


We met with a similar limiting expression in the problem of the 
linear harmonic oscillator (Sec. 28). For large ξ it reduces 
the whole series to an exponential form: 

$$\sum_n \chi_n \xi^n \approx e^{2\xi(2\varepsilon)^{1/2}} \tag{29.36}$$

But such a series cannot give a correct expression for R(ξ), because, 
if we substitute (29.36) into (29.33), we obtain R(∞) = ∞ despite 
the boundary condition. However, if all the coefficients become zero 
from a certain χ_{n+1} onwards, the series in (29.33) becomes a 
polynomial. Then, being multiplied by e^{−ξ(2ε)^{1/2}}, it gives 
R(∞) = 0, as required. It can be seen from (29.34) that χ_{n+1} 
vanishes if 

$$Z - (n+l+1)(2\varepsilon)^{1/2} = 0 \tag{29.37}$$

or 


$$\varepsilon = \frac{Z^2}{2(n+l+1)^2} \tag{29.38}$$


Finally, going over to conventional units and taking into account 
the sign of the energy, we obtain the required spectrum: 


$$E = -\frac{Z^2 m e^4}{2h^2 (n+l+1)^2} \tag{29.39}$$
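
The way the recurrence (29.34) breaks off at the energies (29.38) can be checked with exact rational arithmetic. The sketch below is an illustration only: the function names are hypothetical, χ_0 is set to 1 by assumption (the normalization is left open), and the energy is fixed in advance through (29.37), so that (2ε)^{1/2} = Z/(n + l + 1) is a rational number.

```python
from fractions import Fraction


def series_coefficients(Z, l, n_r):
    """Coefficients chi_0 .. chi_{n_r+1} from recurrence (29.34), chi_0 = 1.

    The energy is fixed by (29.37): (2*eps)**0.5 = Z / (n_r + l + 1),
    so every quantity stays rational and the arithmetic is exact.
    """
    root_2eps = Fraction(Z, n_r + l + 1)
    chi = [Fraction(1)]
    for n in range(n_r + 1):
        numerator = 2 * (Z - (n + l + 1) * root_2eps)
        denominator = l * (l + 1) - (n + l + 1) * (n + l + 2)
        chi.append(chi[-1] * numerator / denominator)
    return chi


def energy_atomic_units(Z, l, n_r):
    """E = -eps from (29.38), in units of m e^4 / h^2."""
    return -Fraction(Z * Z, 2 * (n_r + l + 1) ** 2)


if __name__ == "__main__":
    # Z = 1, l = 0, one node: the last coefficient is exactly zero.
    print(series_coefficients(1, 0, 1))
    print(energy_atomic_units(1, 0, 1))
```

The last element of the returned list is the first vanishing coefficient; all later ones vanish with it, so the series is a polynomial of degree n_r.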


Quantum Numbers. The number n is the degree of the polynomial 
Σ χ_n ξ^n involved in the wave function expression; below it is 
denoted n_r. A more detailed analysis shows that the polynomial has 
exactly n real roots. Since χ_0 ≠ 0 and χ_n ≠ 0, none of the roots 
are equal to either zero or infinity. Therefore, if we examine the 
dependence of the wave function on the radius, it has n zeros, or 
“nodes”, not counting the zero at ξ = ∞, associated with the 
finiteness of the motion, and the zero at ξ = 0, which occurs in all 
functions with l ≠ 0. The term “node” is used instead of “zero” by 
analogy with the nodes of a vibrating string fixed at both ends (the 
latter problem is completely analogous to the 
motion of a particle in an infinitely deep potential well). 

Obviously, the wave function has zeros not only in the case of 
a hydrogen atom. Therefore, the number n_r, or the number of nodes 
of the function at a finite distance from the origin of the coordinate 
system, can characterize the state of an electron in any atom, insofar 
as, from the qualitative aspect, it is legitimate to describe the 
action of all other electrons on an individual electron with the help 
of the effective potential U(r). For the same reason, it is possible 
to describe the state of a separate electron with the help of the 
number l, which is used to express the square of the angular 
momentum. There is one more number that is expressed in terms of n_r 
and l; it is denoted n and related to n_r and l as follows: 

$$n = n_r + l + 1 \tag{29.40}$$

These quantities can also be used for complex, many-electron 
atoms, even though the energy of the electrons in them cannot be 
expressed by the simple formula (29.39). The numbers n, n_r, and 
l are convenient for classifying electron states. 

The number l is called the orbital quantum number of an electron. 
In spectroscopy the following system of notation is accepted: an 
electron state with l = 0 is called the s state, with p, d, and f states 
corresponding to l = 1, 2, and 3. Higher values of l do not occur in 
unexcited atoms. The total angular momentum of an atom as a 
whole is found by adding the angular momenta of individual 
electrons (see following section). 

As we know, the projection of the angular momentum of a separate 
atom on some axis is equal to hk, where −l ≤ k ≤ l. The integer k 
is called the magnetic quantum number, since as a rule the axis 
referred to is the one along which an external magnetic field is 
directed. 

Next, n_r is the number of zeros (nodes) in the radial wave 
function, and it is called the radial quantum number. 

Finally, the sum (29.40), n, is called the principal quantum number 
of an electron in an atom. 

From (29.39) and (29.40), the energy of an electron in a hydrogen 
atom expressed in terms of the principal quantum number is 

$$E = -\frac{me^4}{2h^2 n^2} \approx -\frac{13.6}{n^2}\ \text{eV} \tag{29.41}$$

If an external source imparts this energy to the electron in a 
hydrogen atom, the electron may be ejected from the atom. For this 
reason the corresponding energy is known as the binding energy . 
A formula analogous to (29.41) is obtained also for the positive 
helium ion. Apart from the Z² = 4-fold difference, there is a more 
subtle distinction stemming from a slight difference between the 
reduced mass of the helium ion and the reduced mass of the hydrogen 
atom, due to the difference between the masses of their nuclei 
(we recall that the reduced mass of an atom with one electron closely 
approximates the mass of the electron). 
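
The size of the reduced-mass effect can be estimated with a short Python sketch. It is an illustration under stated assumptions: the value of m e^4/(2h^2) and the electron-to-nucleus mass ratios are modern tabulated figures, not taken from the text, and the function name is hypothetical.

```python
# Binding energies of hydrogen-like ions with the reduced-mass correction.
# The constants are modern tabulated values (an assumption of this sketch).
RYDBERG_INFINITE_MASS_EV = 13.605693        # m e^4 / (2 h^2) in eV
ELECTRON_TO_PROTON_MASS = 1 / 1836.15267    # m / M for a proton
ELECTRON_TO_ALPHA_MASS = 1 / 7294.29954     # m / M for a helium-4 nucleus


def binding_energy_ev(Z, n, electron_to_nucleus_mass=0.0):
    """|E| from (29.39) and (29.40) with m replaced by m / (1 + m/M)."""
    reduced_mass_factor = 1.0 / (1.0 + electron_to_nucleus_mass)
    return RYDBERG_INFINITE_MASS_EV * reduced_mass_factor * Z**2 / n**2


if __name__ == "__main__":
    hydrogen = binding_energy_ev(1, 1, ELECTRON_TO_PROTON_MASS)
    helium_ion = binding_energy_ev(2, 1, ELECTRON_TO_ALPHA_MASS)
    print(f"H   ground state: {hydrogen:.4f} eV")
    print(f"He+ ground state: {helium_ion:.4f} eV")
    print(f"ratio: {helium_ion / hydrogen:.5f}")
```

The ratio of the two ground-state energies comes out slightly above 4, which is the subtle distinction mentioned above.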

The state with n = 1 is the ground state. The atom cannot emit 
light in this state, because there is simply no lower state into which 
it can pass. 

We pointed out at the beginning of this section that the energy 
of a particle moving in a central field can depend only on the square 
of its angular momentum, but not on the latter’s projection. In 
other words, wave functions corresponding to identical l’s and, 
as we now know, identical n_r’s but different k’s correspond to the 
same electron energy. This is what is called degeneracy. 

We find a somewhat unexpected result for the Coulomb field: 
the energy depends not on the two quantum numbers n_r and l but 
only on their sum n_r + l. For different n_r and l, but such that their 
sum is the same, the energy eigenvalues are also the same. 
Consequently a further degeneracy of energy states occurs, since to 
different l’s correspond quite different spherical functions Y_l^k. 
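
The counting behind this degeneracy can be made explicit. The following Python sketch, with hypothetical function names, lists every combination (n_r, l, k) that shares a given principal quantum number n = n_r + l + 1; spin is not counted here.

```python
def states_with_principal_number(n):
    """All (n_r, l, k) with n_r + l + 1 == n and -l <= k <= l."""
    return [(n - l - 1, l, k)
            for l in range(n)
            for k in range(-l, l + 1)]


def degeneracy(n):
    """Number of states of equal energy in a Coulomb field (spin ignored)."""
    return len(states_with_principal_number(n))


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, degeneracy(n), states_with_principal_number(n))
```

For each n the count equals the sum of 2l + 1 over l = 0, ..., n − 1, that is, n².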

Note should be made of a substantial difference between degeneracy 
with respect to k and with respect to l. The first occurs for any 
analytical form of the potential U(r) and is connected only with the 
symmetry of the force field acting on the particle. In the case of 
U = U(r), it is the symmetry of all rotations around the origin of the 
coordinate system. This type of degeneracy is called necessary and 
is due to the symmetry specified in the conditions of the problem 
on the determination of energy eigenvalues. 

Degeneracy in a Coulomb field, however, is due wholly to the 
specific form of the dependence U = −Ze²/r. If, for example, the 
potential energy curve is of power-law form, U = −a/r^b, 
degeneracy with respect to l occurs only when the exponent is 
b = 1. This type of degeneracy is called accidental: 
it occurs at some select value of the parameter involved in the 
Hamiltonian. 

There is a deep connection between the accidental degeneracy 
of states in a Coulomb field in classical and quantum mechanics. 
In classical mechanics (Sec. 10) it was shown that in Keplerian 
motion energy depends only upon the sum of the adiabatic invariants 
J_r and J_φ, but not on each of them separately. Subsequently (Sec. 31) 
it will be shown that there is a very simple relationship between 
adiabatic invariants and quantum numbers. Therefore, Eq. (29.39) 
is a direct quantum analog of the classical formula expressing the 
dependence of energy on the adiabatic invariants (see Exercise 3, 
Section 10). 

In turn, the classical form of the dependence of energy upon the 
sum of the adiabatic invariants J_r and J_φ determines the closed 
nature of the path of finite motion in the Kepler problem (an ellipse). 
In the relativistic Kepler problem (Exercise 9, Section 14) the 
path is not closed, and the dependence of the energy upon the 
adiabatic invariants is more complex. Correspondingly, in quantum 
theory, too, there is no accidental degeneracy with respect to l. 

Parity of State. The state of an electron in an atom is characterized 
by one more property which, unlike energy and angular momentum, 
has no classical analogue. This property relates directly to the 
wave function itself and is associated with its behaviour under 
changes of the signs of all three coordinates (cf. Sec. 15). 

Consider the wave function of an electron. The form of the wave 
equation (29.15) does not change if we substitute 

$$x = -x', \quad y = -y', \quad z = -z' \tag{29.42}$$

As we know from Section 15, this transformation is called in¬ 
version, and it transforms a right-handed coordinate system into 
a left-handed one, and vice versa. No spatial rotation of the axes 
can make the two systems coincide. 

The wave equation (29.15) is linear. Therefore, if it has not changed 
its form, its solution, determined by the boundary conditions up 
to a constant factor, can acquire only a certain additional factor: 

$$\psi(x, y, z) = C\psi(x', y', z') = C\psi(-x, -y, -z) \tag{29.43a}$$

But, in principle, the primed left-handed system in no way differs 
from the unprimed right-handed system. Hence, the reverse trans¬ 
formation must involve the same transformation factor C : 

$$\psi(x', y', z') = \psi(-x, -y, -z) = C\psi(x, y, z) \tag{29.43b}$$

Substituting this into (29.43a), we obtain 

$$\psi(x, y, z) = C^2 \psi(x, y, z)$$

whence 

$$C^2 = 1, \quad C = \pm 1 \tag{29.44}$$

The function is said to be even for C = 1 and odd for C = −1. 
The wave functions of a linear harmonic oscillator possessed a 
similar property: the energy operator was also even, ℋ(x) = ℋ(−x), 
while the parity of the wave functions alternated, depending on 
the eigenvalue number n, that is, they were either even or odd. 

It is not hard to establish the quantum number that determines 
the parity of an electron’s wave function in a central field. From 
the spherical function expression (24.39), it is obtained by 
differentiating the function r^{−1} with respect to the coordinates 
l times. The function r^{−1} is even, and each differentiation with 
respect to a coordinate changes the parity, so that under inversion 
the spherical function is multiplied by (−1)^l. Consequently, the 
number l fully defines the parity of the wave function of a particle 
in its coordinate dependence.* 

If a many-electron atom is considered, the parity of its total wave 
function is equal to the parity of the number Σ_i l_i, where l_i are 
the orbital quantum numbers of the individual electrons. 
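
The rule for the parity of a many-electron state amounts to a one-line computation. The sketch below is a minimal illustration; the function name and the representation of a configuration as a list of orbital quantum numbers are assumptions made for the example.

```python
def parity_of_configuration(orbital_numbers):
    """Return +1 (even) or -1 (odd): the parity of the sum of the l_i."""
    return 1 if sum(orbital_numbers) % 2 == 0 else -1


if __name__ == "__main__":
    # Two s-electrons and three p-electrons: the sum of l_i is 3, an odd state.
    print(parity_of_configuration([0, 0, 1, 1, 1]))
    # Two s-electrons and two p-electrons: the sum of l_i is 2, an even state.
    print(parity_of_configuration([0, 0, 1, 1]))
```

Note that s-electrons (l = 0) never affect the result, whatever their number.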

The parity of a wave function can be represented in operator form 
by introducing an inversion operator G such that 

$$G\psi(x, y, z) = \psi(-x, -y, -z) \tag{29.45}$$

Since the Hamiltonian for an atom is an even function of the 
coordinates, we can write 

$$G\mathcal{H} = \mathcal{H} \tag{29.46}$$

From this it follows that the parity operator commutes with the 
Hamiltonian: 

$$G\mathcal{H}\psi = \mathcal{H}G\psi \tag{29.47}$$

The eigenvalues of the operator G are, in accordance with (29.44), 
the numbers C = ±1, since 

$$G\psi = \psi(-x, -y, -z) = C\psi \tag{29.48}$$

According to (29.47) and (27.8), these numbers exist simultaneously 
with the energy eigenvalue. 

Speaking of a separate electron, giving its parity of state provides 
no new information, since parity is determined by the orbital 
quantum number l. In a many-electron atom the state is given not only 
by the quantum numbers of individual electrons, but also by how 
the angular momenta of individual electrons add up into the resultant 
angular momentum of the atom as a whole. 

Addition of Angular Momenta. To begin with, we shall consider 
the rule for the addition of the angular momenta of two electrons 
in an atom. The angular momentum of each electron is not, strictly 
speaking, an integral of the motion, since the electron moves in the 
field not only of the nucleus but of the other electron as well. Such 
a field does not possess symmetry with respect to rotations about the 
nucleus, therefore only the total angular momentum of both electrons 
can be conserved, but not the angular momentum of each one sepa¬ 
rately. 

Nevertheless, from the qualitative aspect, the electrons’ orbital 
quantum numbers l_1 and l_2 continue to provide a correct description 
of their respective states. They help to define the exact quantum 
number of the total angular momentum. 

We shall reason as though l_1 and l_2 were also exact quantum 
numbers. But having determined the total angular momentum with their 
help, we must bear in mind that only it is an exact quantum integral 
of motion of the system. 

To make the reasoning specific, let l_1 > l_2. We project the smaller 
angular momentum on the direction of the greater. Since the 
projection of l_2 varies from l_2 to −l_2, we find that in sum with the 
greater angular momentum l_1 we obtain the following possible maximum 
projections of the total angular momentum (in units of h): 

$$L = l_1 + l_2,\ \ l_1 + l_2 - 1,\ \ l_1 + l_2 - 2,\ \dots,\ \ l_1 - l_2 \tag{29.49}$$

Each of these values defines the square of the resultant angular 
momentum according to the formula 

$$M^2_{\text{total}} = h^2 L(L+1) \tag{29.50}$$

It is not difficult to derive this formula by the method of mean 
values in the same way as we developed the formula for the angular 
momentum square of a separate electron (Exercise 1, Section 25). 

Thus, the angular momentum of two electrons varies within the 
limits from L = l_1 + l_2 to L = |l_1 − l_2|. The projection of the 
resultant angular momentum L may vary from −L to L. The rule 
set forth here agrees with the fact that the magnitude of the sum 
of two vectors lies between the sum and the difference of their absolute 
values. 
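
A simple consistency check of the addition rule is that it conserves the number of states: the (2l_1 + 1)(2l_2 + 1) combinations of the two individual projections must equal the total number of projections of all allowed L. The Python sketch below, with hypothetical function names, performs this count.

```python
def total_angular_momenta(l1, l2):
    """Values of L allowed by (29.49): |l1 - l2|, ..., l1 + l2."""
    return list(range(abs(l1 - l2), l1 + l2 + 1))


def count_states(l1, l2):
    """Count states two ways: by (k1, k2) and by (L, projection of L)."""
    uncoupled = (2 * l1 + 1) * (2 * l2 + 1)
    coupled = sum(2 * L + 1 for L in total_angular_momenta(l1, l2))
    return uncoupled, coupled


if __name__ == "__main__":
    print(total_angular_momenta(2, 1))   # a d-electron and a p-electron
    print(count_states(2, 1))
```

Both numbers returned by `count_states` agree for any pair l_1, l_2, as they must if the rule (29.49) lists every resultant state exactly once.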

By analogy with the states of individual electrons labelled s, p, d, f 
for l = 0, 1, 2, 3, the states of an atom are denoted by capital letters 
S, P, D, F, ... for L = 0, 1, 2, 3, ... . To larger L’s correspond 
the letters following F in alphabetical order. 

The generalization of the rule presented above for the case of 
three or more electrons is self-evident: first any two angular momenta 
are added according to the rule (29.49), then a third is added accord¬ 
ing to the same rule, etc. 

The Simultaneous Operation of the Angular Momentum and Parity 
Conservation Laws. We have thus established that a system of 
electrons in the central field of a nucleus is subject to two 
conservation laws: conservation of total angular momentum and of 
total parity. Unlike the case of one electron, in a system of electrons 
these two laws by no means reduce to the same thing. The total parity 
is found by arithmetic addition of the individual numbers l, while the 
total angular momentum is determined by geometrical (vector) 
composition. Therefore, to describe the state of a system of electrons 
we must define the angular momentum and parity corresponding to 
that state. 

Let us now examine the restrictions that can be imposed on the 
possible transitions between different atomic states by these two 
conservation laws operating together. As an example we take an 
excited many-electron atom with total angular momentum L = 0, 
that is, in the S state. The atom has s-electrons and, we assume, an 
odd number of p-electrons. Consequently, the atom is in an odd state 
(its parity is determined by the addition of an odd number of units; 
see footnote 8). Furthermore, let the total excitation energy of the 
atom be sufficient to eject one p-electron, but such that as a result 
of the rearrangement of the electron cloud the atom or ion remains 
again in the S state, with L = 0. Since the angular momenta are added 
vectorially, such a state may occur for either an even or odd number 
of p-electrons. 

The condition that the total angular momentum must be conserved 
requires that the electron emitted from the atom have l = 0, since 
the total angular momentum of the initial system was zero, and 
after the emission of the electron there remains a system whose 
angular momentum is by definition zero. In other words, the assump¬ 
tion has been made that this is the only state in which the electron 
possesses sufficient energy for emission. Hence, the angular mo¬ 
mentum conservation law requires that the angular momentum of the 
emitted electron also be zero. 

These considerations show that, given the appropriate initial 
assumptions, the laws of conservation of energy and angular mo¬ 
mentum can be satisfied. Let us see if the parity conservation law 
also holds in these circumstances. The remaining system now has 
two p-electrons with l = 1, so that its state is even. The emitted 
electron has, in accordance with the angular momentum conservation 
law, l = 0, that is, its state is also even. Consequently, the ultimate 
state of the system must also be even, whereas its initial state was 
assumed to be odd. It follows that, by the parity conservation law, 
the transition considered is impossible. The energy and angular 
momentum conservation laws, which have classical analogues, do not 
preclude such a transition, which is impossible according to the 
quantum law of conservation of parity, for which there is no classical 
analogue. We have examined a typical case of a transition “forbid¬ 
den” for considerations of parity (from L = 0 to L = 0 with an 
assumed change in parity). 
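
The argument can be traced step by step in a short Python sketch. It is a simplified illustration under stated assumptions: states are represented only by lists of orbital quantum numbers, the function names are hypothetical, and restrictions on equivalent electrons (which the text does not discuss here) are ignored.

```python
def couple(possible_L, l):
    """All L reachable by adding one momentum l to any value in possible_L."""
    return sorted({L
                   for L0 in possible_L
                   for L in range(abs(L0 - l), L0 + l + 1)})


def possible_total_L(orbital_numbers):
    """Possible total L for a list of l_i, applying rule (29.49) repeatedly."""
    result = [0]
    for l in orbital_numbers:
        result = couple(result, l)
    return result


def parity(orbital_numbers):
    """+1 for an even state, -1 for an odd one."""
    return (-1) ** sum(orbital_numbers)


def emission_allowed(initial_ls, ion_ls, emitted_l, L_initial=0, L_ion=0):
    """Return (angular momentum conserved?, parity conserved?)."""
    momentum_ok = L_initial in couple([L_ion], emitted_l)
    parity_ok = parity(initial_ls) == parity(ion_ls + [emitted_l])
    return momentum_ok, parity_ok


if __name__ == "__main__":
    # Three p-electrons can add up to L = 0 (cf. the footnote to the text).
    print(possible_total_L([1, 1, 1]))
    # S state with three p-electrons -> S state with two p-electrons + electron.
    for emitted_l in range(3):
        print(emitted_l, emission_allowed([1, 1, 1], [1, 1], emitted_l))
```

No value of the emitted electron's l satisfies both conditions at once: l = 0 conserves the angular momentum but changes the parity, while odd l conserves the parity but not the angular momentum.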

We repeat that the law of conservation of parity is independent 
of the law of conservation of angular momentum, because parity is 
determined by a different law of addition than angular momentum. 

In quantum mechanics, the angular momentum conservation law 
must always be applied together with the parity conservation law. 
Common to both these laws is that they derive from the invariance 
of the equations with respect to the spatial orientation of the 
coordinate axes. But the orientation of the axes can be changed not 
only by rotations: an additional transformation is inversion, which 
is not reducible to any rotation. This is what the parity conservation 
law provides for in addition to the angular momentum conservation law. 

8 The vector addition of three unit angular momenta may yield zero in the 
following way. Two vectorially added momenta may yield an angular momentum 
equal to unity, since their resultant momentum varies between 0 and 2. The 
resultant unity may yield zero when added to l = 1 for the third electron. 

In this form the parity conservation law is unconditionally appli¬ 
cable to systems in which electromagnetic interactions occur. This 
is an experimental fact on the basis of which the equations of electro¬ 
dynamics are made invariant with respect to inversions of the coor¬ 
dinate system (Sec. 15). 

The much weaker interactions occurring in some elementary 
particle transformations (for example, β-decay) do not satisfy the 
parity conservation law. 


Hydrogenlike Atoms. At the beginning of this section it was pointed 
out that alkali-metal atoms somewhat resemble the hydrogen atom. 
The outer electron in these atoms is relatively weakly bound to the 
atomic core, which consists of the nucleus and all the remaining 
electrons. The wave functions for electrons of the atomic core differ 
from zero at smaller distances from the nucleus than the wave func¬ 
tion for the outer electron, so that the core, as it were, screens the 
nuclear charge. The field in which the outer electron moves is ap¬ 
proximately Coulomb, provided only that it is not situated in the 
region of the core. It is for this reason that the spectra of alkali- 
metal atoms resemble the hydrogen-atom spectrum. The energy 
levels of these atoms, which are due to excitation of the outer elec¬ 
tron, are given by the equation 


$$E_{nl} = -\frac{me^4}{2h^2}\, \frac{1}{[n + \Delta(l)]^2} \tag{29.51}$$

where the correction Δ(l) depends upon the orbital quantum number. 
It accounts for the deviation of the field from a purely Coulomb one 
at small distances from the nucleus. The greater l is, the farther from 
the nucleus is the outer electron (according to (29.20)) and the 
smaller is Δ(l). 

Thus, the energy levels of alkali metals, like the energy levels 
of all atoms except hydrogen, depend upon n and l. 
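
How a correction of the type Δ(l) removes the degeneracy with respect to l can be seen in a small numerical sketch of (29.51). The values of Δ(l) used below are hypothetical numbers invented for illustration, not measured data for any atom; taking them negative, with a magnitude that decreases as l grows, is likewise an assumption of the example.

```python
RYDBERG_EV = 13.6  # m e^4 / (2 h^2) in eV, rounded

# Hypothetical corrections Delta(l), invented for illustration only:
# negative, and shrinking in magnitude as l grows.
ILLUSTRATIVE_DELTA = {0: -1.3, 1: -0.8, 2: -0.01}


def alkali_level_ev(n, l, delta=ILLUSTRATIVE_DELTA):
    """Energy from (29.51): E = -(m e^4 / 2 h^2) / (n + Delta(l))**2."""
    return -RYDBERG_EV / (n + delta[l]) ** 2


if __name__ == "__main__":
    for l, letter in enumerate("spd"):
        print(f"n = 3, {letter} state: {alkali_level_ev(3, l):.3f} eV")
    print(f"hydrogen, n = 3: {-RYDBERG_EV / 9:.3f} eV")
```

With these assumed corrections the three levels with n = 3 are distinct, and the level with the largest l lies closest to the hydrogen value.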


EXERCISES 

1. Show that in the definition (29.3) the spherical function coincides 
fully with the Legendre polynomial (29.5) of u = cos ϑ. 

Solution. Since it has already been shown that the functions (29.3) 
and (29.5) are proportional, it is sufficient to prove their identity for some 
value of u = cos ϑ. We shall show this for cos ϑ = u = 1. In the definition 
(29.5), at u = 1 only the term that is obtained from l-fold differentiation 
of the binomial (u² − 1)^l is not zero. Each differentiation yields a factor 
equal to the power of the binomial, and a 2 from the differentiation of the 
square. These factors cancel out with the denominator 2^l l!, so that 
P_l(1) = 1. Now let us prove that Y_l^0(1) is also unity. 

In the definition (29.3) we shift the origin of the coordinate system by 
the quantity ζ in the direction of the z axis so as to make r^{−1} under the 
derivative sign equal to [x² + y² + (z − ζ)²]^{−1/2}. For small ζ we may leave r 
under the derivative sign equal to [x² + y² + z²]^{1/2}. 

We divide both sides of the equation by r^l, multiply by ζ^l, put ζ = 0 
in the expression for the derivative, and sum over l from 0 to ∞. Then on 
the right-hand side we have a Taylor series of the quantity 
[x² + y² + (z − ζ)²]^{−1/2}. 

Hence, Y_l^0(cos ϑ) is the expansion coefficient of (ζ/r)^l in the Taylor 
series of r[x² + y² + (z − ζ)²]^{−1/2}: 

$$r\,[x^2 + y^2 + (z - \zeta)^2]^{-1/2} = \left[1 - 2\frac{\zeta}{r}\cos\vartheta + \frac{\zeta^2}{r^2}\right]^{-1/2} = \sum_{l=0}^{\infty} \left(\frac{\zeta}{r}\right)^l Y_l^0(\cos\vartheta)$$

But for cos ϑ = 1 we have simply 

$$\left(1 - \frac{\zeta}{r}\right)^{-1} = \sum_{l=0}^{\infty} \left(\frac{\zeta}{r}\right)^l$$

Hence, the required coefficient is equal to unity, as was asserted. 

2. Prove that three successive Legendre polynomials are related by 
the formula 

$$(2l+1)\, u P_l(u) = (l+1)\, P_{l+1}(u) + l\, P_{l-1}(u)$$

Solution. Using the result of the preceding exercise, we write 

$$(1 - 2\rho u + \rho^2)^{-1/2} = \sum_{l=0}^{\infty} \rho^l P_l(u)$$

Differentiating this equation with respect to ρ, multiplying by 2ρ, and 
adding the initial expression, we obtain 

$$\sum_l (2l+1)\, \rho^l P_l(u) = (1 - \rho^2)(1 - 2\rho u + \rho^2)^{-3/2}$$

Substituting l + 1 for the summation index l in the derivative with respect 
to ρ, we obtain 

$$\sum_l (l+1)\, \rho^l P_{l+1}(u) = (u - \rho)(1 - 2\rho u + \rho^2)^{-3/2}$$

Now we perform the following operations: multiply the derivative 
with respect to ρ by ρ², add the initial expression multiplied by ρ, and 
replace the summation index l by l − 1. We then obtain 

$$\sum_{l=0}^{\infty} l\, \rho^l P_{l-1}(u) = (\rho - \rho^2 u)(1 - 2\rho u + \rho^2)^{-3/2}$$

We see that the sum of the two latter equations coincides with the first, 
multiplied by u. Comparing the factors of the same powers of ρ, we arrive 
at the required equation. 

It is apparent that this equation could not involve a polynomial of 
order higher than l + 1, because the power of P_l is equal to l. Furthermore, 
since multiplication by u changes the parity of P_l, the equation holds only 
for polynomials of different parity on the left and on the right. 
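
The results of Exercises 1 and 2 lend themselves to a direct check. The Python sketch below builds the Legendre polynomials from the three-term relation, starting from P_0 = 1 and P_1 = u, and then tests P_l(1) = 1 and the generating function numerically. The function names and the representation of a polynomial as a list of coefficients are choices made for this illustration.

```python
from fractions import Fraction


def legendre_polynomials(l_max):
    """Coefficient lists (lowest power first) of P_0 .. P_l_max.

    Built from (l + 1) P_{l+1} = (2l + 1) u P_l - l P_{l-1},
    starting from P_0 = 1 and P_1 = u.
    """
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for l in range(1, l_max):
        shifted = [Fraction(0)] + polys[l]             # u * P_l
        previous = polys[l - 1] + [Fraction(0)] * 2    # P_{l-1}, zero-padded
        polys.append([((2 * l + 1) * a - l * b) / (l + 1)
                      for a, b in zip(shifted, previous)])
    return polys[: l_max + 1]


def evaluate(coefficients, u):
    """Value of the polynomial with the given coefficients at the point u."""
    return sum(c * u**k for k, c in enumerate(coefficients))


if __name__ == "__main__":
    polys = legendre_polynomials(40)
    print(polys[2], polys[3])
    print(all(evaluate(p, 1) == 1 for p in polys))
    rho, u = 0.3, Fraction(2, 5)
    series = sum(rho**l * float(evaluate(p, u)) for l, p in enumerate(polys))
    print(series, (1 - 2 * rho * float(u) + rho**2) ** -0.5)
```

Exact fractions are used so that the check P_l(1) = 1 involves no rounding; only the comparison with the generating function is done in floating point.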

3. Develop and normalize the wave functions in a hydrogen atom with 
l = 0, 1, 2 and n = 1, 2, 3. Take advantage of the fact that 

$$\int_0^{\infty} e^{-x} x^n\, dx = n!$$
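
One possible route to the normalization is sketched below in Python. It rests on assumptions that should be checked against the conventions of the text: the radial function is taken in the form (29.33) with χ_0 = 1, lengths are measured in the atomic unit, and the normalization condition is assumed to be that the integral of R² over ξ from 0 to ∞ equals 1. The integral formula of the exercise is used in the form ∫ e^{−aξ} ξ^k dξ = k!/a^{k+1}; the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def radial_polynomial(Z, l, n_r):
    """Coefficients chi_0 .. chi_{n_r} of the terminating series (29.34)."""
    root_2eps = Fraction(Z, n_r + l + 1)      # (2*eps)**0.5 from (29.37)
    chi = [Fraction(1)]
    for n in range(n_r):
        chi.append(chi[-1] * 2 * (Z - (n + l + 1) * root_2eps)
                   / (l * (l + 1) - (n + l + 1) * (n + l + 2)))
    return chi, root_2eps


def norm_integral(Z, l, n_r):
    """Exact integral of R**2 over xi from 0 to infinity, where
    R = exp(-xi * root_2eps) * sum_n chi_n * xi**(n + l + 1), chi_0 = 1."""
    chi, root_2eps = radial_polynomial(Z, l, n_r)
    a = 2 * root_2eps                          # R**2 carries exp(-a * xi)
    total = Fraction(0)
    for i, ci in enumerate(chi):
        for j, cj in enumerate(chi):
            k = i + j + 2 * l + 2              # power of xi in this term
            total += ci * cj * factorial(k) / a ** (k + 1)
    return total


if __name__ == "__main__":
    for n_r, l in [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]:
        print(f"n = {n_r + l + 1}, l = {l}: integral = {norm_integral(1, l, n_r)}")
```

Under the assumed normalization condition, the normalizing factor for each state is the inverse square root of the printed integral.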


30 


ELECTRON SPIN 

From Eq. (29.41) the ground state of a hydrogen atom has the 
principal quantum number, n, equal to unity. For n = 1 the orbital 
quantum number, l, and the radial quantum number, n_r, must be 
equal to zero, since n = n_r + l + 1, and n_r and l can in no way be 
less than zero. The ground state of a hydrogen atom is thus the s state. 
The orbital motion of an s-electron does not produce a magnetic 
moment because the magnetic moment is proportional to the angular 
momentum. Yet, if the Stern-Gerlach experiment is performed for 
atomic hydrogen, the atomic beam will split, but only into two parts. 
However, when l = 0, as we have already said, there should be no 
splitting due to orbital angular momentum, while for l = 1, the 
beam should split into 3 beams corresponding to the number of 
projections of the angular momentum k = −1, 0, 1. 

The same result is obtained if, instead of hydrogen, we take an alkali metal. 
The electron cloud of any alkali metal consists of an atomic core 
in the S state, that is, one lacking orbital angular momentum, and 
one electron in the s state. In this sense, alkali-metal atoms resemble 
the hydrogen atom. 

For this reason, the state of the atom is not described by the three 
quantum numbers n, l, and k. One more quantum number must be 
stated, with respect to which splitting of the beam occurs. 





Intrinsic Angular Momentum, or Spin, of an Electron. Obviously, 
the additional quantum number must be associated in some way 
with the angular momentum of the electron, since it leads to a 
splitting of the beam in a magnetic field. Splitting into two beams 
can be accounted for only by an angular momentum whose greatest 
projection is equal to h/2. Then it has only two possible projections, 
h/2 and −h/2. 
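
The counting of beams can be written out explicitly. In the Python sketch below, a hypothetical helper lists the projections of an angular momentum j in units of h, from −j to j in unit steps; the number of beams in a Stern-Gerlach experiment is then the length of the list. Exact fractions are used so that half-integer values are represented without rounding.

```python
from fractions import Fraction


def projections(j):
    """Allowed projections, in units of h, of an angular momentum j:
    -j, -j + 1, ..., j (j may be an integer or a half-integer)."""
    j = Fraction(j)
    count = int(2 * j) + 1
    return [-j + step for step in range(count)]


if __name__ == "__main__":
    print(len(projections(0)))                # orbital momentum l = 0
    print(len(projections(1)))                # orbital momentum l = 1
    print(projections(Fraction(1, 2)))        # greatest projection h/2
```

An integer orbital momentum always gives an odd number of beams, so a splitting into two beams requires the half-integer value.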

The Stern-Gerlach experiment was given only as an example. In 
fact, not only this experiment, but the whole enormous aggregate 
of knowledge about the atom indicates that the electron possesses 
an angular momentum h/2 that is not related to its spatial, or as 
it is conventionally called, orbital, motion. This angular momentum 
is termed the spin. It can be said that in the classical analogy an 
electron is like a planet, which has an angular momentum due not 
only to its revolution about the sun, but also to rotation on its own 
axis. 

The analogy with a planet is not far-reaching since the angular 
momentum of a rotating rigid body can be made equal to any value, 
while the spin of an electron always has projections ±h/2 and no 
others. Therefore, spin is a purely quantum property of the electron; 
in the limiting transition to classical mechanics it becomes zero. 
We must not take the word “spin” too literally, for the electron 
actually does not resemble a rigid body like a top or a spindle. 
The analogy between an electron and a top consists only in that 
their motion is not described solely by the spatial location of one 
point, and they possess an internal rotational degree of 
freedom. 

There is, rather, a certain analogy between the electron and the 
quantum of light: as was shown in Section 19, an electromagnetic 
wave possesses an internal polarization degree of freedom. Two 
waves of identical phase may possess different polarization. If we 
liken the coordinate-dependent phase with the spatial argument of 
the wave function of the electron, the polarization degree of freedom 
of the wave can be likened to the spin degree of freedom of the quan¬ 
tum. But the two are far from identical: wave polarization is a classi¬ 
cal concept, while spin is a quantum concept. 


The General Definition of Angular Momentum. Since spin is not 
associated with the spatial motion of the electron, definition of the 
angular momentum due to it according to the operator formula 
M = r × p is, evidently, unacceptable. A more general definition 
is required, which would be valid for all cases, especially, of course, 
one that would describe the property of angular momentum as an 
integral of the motion. 





In classical mechanics, the angular momentum of a mechanical 
system is defined (for one projection) as 

$$M_z = \frac{\delta S}{\delta\varphi} \tag{30.1}$$

Here, S denotes the action of the system, and δφ is an infinitesimal 
angle of rotation about the z axis. The notation δφ stresses the fact 
that it possesses a somewhat different meaning than in the previous 
definition of angular momentum (see Secs. 5 and 24): φ denotes not 
the azimuth angle of a separate mass point, but the angle of rotation 
of a rectangular coordinate system. In the present case the rotation 
is infinitesimal. 

If we take a system of mass points, it is obvious that a rotation of 
the coordinates through δφ is equivalent to a displacement of each 
point by δφ along the azimuth, so that in this case definition (30.1) 
yields the projection of the total angular momentum as an additive 
integral of the motion. The same is true of the angular momentum of 
a rotating solid body, all points of which rotate through the same 
angle. 

Angular momentum as an additive integral of motion of a closed 
system exists only because there exists the symmetry of relative 
rotations, which we called the isotropy of space in Section 2. The 
general definition of angular momentum in quantum mechanics can 
also be linked with such symmetry, irrespective of whether the 
angular momentum is due to the spatial displacement of a mass 
point or to some internal degree of freedom that cannot be described 
with the help of conventional coordinates. The correctness of such 
a generalization is verified, as always, by experiment. 

If the angular momentum is due solely to spatial motion, it is 
more convenient to proceed from the classical equivalence between 
wave function and action: 

\J) = giS/h 

(see (23.5)). We find the change in ip in an infinitesimal rotation of 
the coordinate system: 

= 6e iS/h = (-j-) 8S = -i- MM (30.2) 

where we made use of Eq. (30.1). 

The generalization consists in replacing $M_z$ by a certain angular
momentum $J_z$ of arbitrary origin. Then (30.2) must be rewritten
in the form

$$\delta\psi = \frac{i}{h}\, J_z\, \delta\varphi\, \psi \qquad (30.3)$$

where $\psi$ is the wave function, which is no longer associated with the
action since it may involve other variables besides spatial ones.

We shall treat Eq. (30.3) as the quantum definition of the operator
$J_z$; in other words, we pass from the magnitude of the angular
momentum to the operator of the angular momentum. Let us show
that with this choice of the angular momentum operator its projections
satisfy the conventional commutation rules (24.24), which were
developed for the projections of the angular momentum of spatial
motion of a particle. The latter are termed the orbital angular momentum
projections (the term “orbital” goes back to Bohr’s theory, which
assumed the existence of orbits).

Every operator is defined with respect to some entity on which it 
operates: in geometrical space, for example, this may be a vector, 
in Hilbert space a state vector, that is, a wave function. It is
convenient to begin by obtaining the commutation relation for the
operators $J_x$, $J_y$, $J_z$ in their operation on the radius-vector
components $x$, $y$, $z$.

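If we introduce three unit vectors $\mathbf{n}^{(1)}$, $\mathbf{n}^{(2)}$, and $\mathbf{n}^{(3)}$, then, as
we know from Sections 8 and 9, the change in the radius vector in
a rotation of the coordinate system is

$$\delta_i \mathbf{r} = \mathbf{n}^{(i)}\, \delta\varphi_i \times \mathbf{r} \qquad (30.4)$$

where $\delta\varphi_i$ is an infinitesimal rotation around the $i$th axis.

We rotate the coordinate system once more about the $k$th axis
($k \neq i$); then the change of $\delta_i \mathbf{r}$ in the new rotation is

$$\delta_k(\delta_i \mathbf{r}) = \mathbf{n}^{(k)}\, \delta\varphi_k \times \delta_i \mathbf{r} = [\mathbf{n}^{(k)} \times (\mathbf{n}^{(i)} \times \mathbf{r})]\, \delta\varphi_i\, \delta\varphi_k \qquad (30.5a)$$

Now let us perform the same operations in reverse order:

$$\delta_i(\delta_k \mathbf{r}) = \mathbf{n}^{(i)}\, \delta\varphi_i \times \delta_k \mathbf{r} = [\mathbf{n}^{(i)} \times (\mathbf{n}^{(k)} \times \mathbf{r})]\, \delta\varphi_i\, \delta\varphi_k \qquad (30.5b)$$

and find the difference between both changes in the radius vector:

$$\delta_i(\delta_k \mathbf{r}) - \delta_k(\delta_i \mathbf{r}) = [\mathbf{n}^{(i)} \times (\mathbf{n}^{(k)} \times \mathbf{r}) - \mathbf{n}^{(k)} \times (\mathbf{n}^{(i)} \times \mathbf{r})]\, \delta\varphi_i\, \delta\varphi_k$$

$$= [(\mathbf{n}^{(i)} \cdot \mathbf{n}^{(k)})\, \mathbf{r} - \mathbf{n}^{(k)} (\mathbf{n}^{(i)} \cdot \mathbf{r}) - (\mathbf{n}^{(i)} \cdot \mathbf{n}^{(k)})\, \mathbf{r} + \mathbf{n}^{(i)} (\mathbf{n}^{(k)} \cdot \mathbf{r})]\, \delta\varphi_i\, \delta\varphi_k$$

$$= -[(\mathbf{n}^{(i)} \times \mathbf{n}^{(k)}) \times \mathbf{r}]\, \delta\varphi_i\, \delta\varphi_k \qquad (30.6)$$

But if the numbers of the unit vectors follow in cyclic order, the
vector product $\mathbf{n}^{(i)} \times \mathbf{n}^{(k)} = \mathbf{n}^{(l)}$, where $i \neq k \neq l$. Consequently

$$\delta_i(\delta_k \mathbf{r}) - \delta_k(\delta_i \mathbf{r}) = -(\mathbf{n}^{(l)} \times \mathbf{r})\, \delta\varphi_i\, \delta\varphi_k \qquad (30.7)$$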

In other words, rotations about different Cartesian axes do not
commute. But it can be readily observed that the commutation
relations for rotations about different axes are the same as the
commutation relations for the respective components of angular momentum.
Indeed, if we compare the rotation through angle $\delta\varphi_k$ about the
$k$th axis with the operator $(i/h)\, J_k\, \delta\varphi_k$ in the preceding equation, we
have, after cancelling out $\delta\varphi_i\, \delta\varphi_k$ (we assumed that $\delta\varphi_l = \delta\varphi_i\, \delta\varphi_k$),

$$J_i J_k - J_k J_i = i h J_l \qquad (30.8)$$

where $i \neq k \neq l$.

We have thus obtained a commutation relation for the operators
of the projections of angular momentum of arbitrary origin,
irrespective of the degrees of freedom to which they refer. We should
assume that the commutation relations (30.8) are also valid for the
operation of the operators $J_x$, $J_y$, and $J_z$ in a Hilbert space of state
vectors. A similar assumption was in effect made with regard to the
operators of the orbital angular momentum projections, and to all
operators in general.

Only on the assumption that the operators of the angular- 
momentum projections always satisfy the commutation relations 
(30.8) can angular momentum be an additive integral of the motion 
of a closed system. If the orbital angular-momentum projections 
satisfied one set of commutation relations, while the spin angular 
momentum projections satisfied another, then the total angular 
momentum would be subject to some quite special commutation 
relations. In particular, it would commute with the Hamiltonian
differently than the orbital angular momentum does, which would
violate its conservation conditions. But since experience shows that
the total angular momentum of a closed system is conserved for the 
same conditions as those for which the orbital angular momentum 
alone is conserved in a system not possessing spin, it is apparent 
that (30.8) are the only possible commutation relations between the 
angular momentum components. 

Note also that in a system possessing spin the strict conservation 
law holds only for the total angular momentum $\mathbf{J}$, which is compounded
of the orbital angular momentum and spin according to the rules
of vector addition (Sec. 29). The individual angular momentum 
components of motion can be regarded as integrals of the motion only 
approximately, just as the orbital quantum numbers of separate 
electrons were formally treated as integrals in finding the total 
orbital angular momentum of the atom. 

The Eigenvalues of the Angular Momentum Square and Projection.
It follows from the relations (30.8), as earlier from (24.24), that
the square of the angular momentum commutes with any of its
projections. We shall now proceed only from the relations (30.8) to
find the eigenvalues of the square of the angular momentum and one
of its projections.

Take some state of a system characterized, for the time being,
by an unknown eigenvalue of the square of the angular momentum,
$J^2$, and of its projection, $J_z$. Although we have not yet determined these
numbers, we can legitimately consider the operators corresponding
to them in the state being examined to be diagonal:

$$(J^2)_{J^2 J_z;\, J^{2\prime} J_z'} = J^2\, \delta_{J^2 J^{2\prime}}\, \delta_{J_z J_z'} \qquad (30.9)$$

$$(J_z)_{J^2 J_z;\, J^{2\prime} J_z'} = J_z\, \delta_{J^2 J^{2\prime}}\, \delta_{J_z J_z'} \qquad (30.10)$$

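We start with the operator expression

$$J^2 = J_x^2 + J_y^2 + J_z^2$$

from which we develop the relation between the mean values:

$$\langle J^2 \rangle = \langle J_x^2 \rangle + \langle J_y^2 \rangle + \langle J_z^2 \rangle \qquad (30.11)$$

But in the eigenstates of operators the means of the corresponding
quantities are equal to the eigenvalues in those states, so that

$$J^2 = J_z^2 + \langle J_x^2 \rangle + \langle J_y^2 \rangle \qquad (30.12)$$

It is apparent that the mean values of the squares, $\langle J_x^2 \rangle$ and $\langle J_y^2 \rangle$,
cannot be less than zero, and therefore

$$J_z^2 < J^2 \qquad (30.13)$$

or

$$-\sqrt{J^2} < J_z < \sqrt{J^2} \qquad (30.14)$$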

We shall measure the angular momentum in natural units of $h$,
returning when necessary to conventional units by multiplying by $h$.
Let, then, the maximum absolute value of the angular momentum
projection be $J$. Instead of (30.14) we write, using now the sign $\leq$
instead of $<$:

$$-J \leq J_z \leq J \qquad (30.15)$$

We saw from the example of orbital angular momentum that
$\sqrt{J^2} > J_z$. From (30.15) it follows that the matrix elements of any
quantity depending upon $J_z$ must, in the state with given $J^2$, vanish
when even one of the indices $J_z$, $J_z'$ becomes greater than $J$. We apply
this to the matrix elements of the operators $J_x \pm iJ_y$.

We take two commutation relations of form (30.8):

$$J_x J_z - J_z J_x = -i J_y$$

$$J_y J_z - J_z J_y = i J_x$$

Multiply the second by $\pm i$ and add to the first to get

$$(J_x \pm iJ_y) J_z - J_z (J_x \pm iJ_y) = \mp (J_x \pm iJ_y) \qquad (30.16)$$


Since all three operators $J_x$, $J_y$, and $J_z$ commute with $J^2$, the
respective matrices are diagonal in $J^2$. They have different indices
only with respect to $J_z$. Forming matrix elements of both sides of
Eq. (30.16), we obtain

$$\sum_{J_z''} \left[ (J_x \pm iJ_y)_{J_z J_z''}\, (J_z)_{J_z'' J_z'} - (J_z)_{J_z J_z''}\, (J_x \pm iJ_y)_{J_z'' J_z'} \right] = \mp (J_x \pm iJ_y)_{J_z J_z'} \qquad (30.17)$$

We now take advantage of the fact that the matrix $(J_z)_{J_z J_z'}$ is
itself diagonal in the representation in which $J_z$ is the independent
variable. This is expressed by Eq. (30.10), with the help of which
we write in simplified form:

$$(J_z)_{J_z J_z'} = J_z\, \delta_{J_z J_z'}$$

Substituting this into (30.17) and collecting all the terms of the
equation in the left-hand side, we obtain

$$(J_x \pm iJ_y)_{J_z J_z'}\, (J_z' - J_z \pm 1) = 0 \qquad (30.18)$$

Since one of the two factors in the left-hand side of this equation
must be zero, we see that $J_x + iJ_y$ possesses only such matrix
elements for which the column number is one less than the row
number ($J_z' = J_z - 1$), while $J_x - iJ_y$ possesses only such matrix
elements for which the column number is one greater than the row
number ($J_z' = J_z + 1$). At the same time, the eigenvalues of $J_z$
vary only by unity (expressed in units of $h$) and pass through
numbers from $-J$ to $J$. But in that case there are two and only two
possibilities: $J$ is either an integer or a half-integer (1/2, 3/2, etc.),
since only integers or half-integers can, when reduced by an integral
number of units, transform into themselves with the reverse sign.
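The counting argument above can be tried out on a few values. The following is a minimal sketch, assuming units of $h$; the helper names `projections` and `ladder_closes` are illustrative and do not come from the text. It steps down from $J$ in unit steps and reports whether the sequence lands exactly on $-J$.

```python
from fractions import Fraction


def projections(J):
    """List J_z = J, J - 1, ... for as long as J_z >= -J (units of h)."""
    J = Fraction(J)
    values = []
    m = J
    while m >= -J:
        values.append(m)
        m -= 1
    return values


def ladder_closes(J):
    """True if unit steps down from J land exactly on -J."""
    return projections(J)[-1] == -Fraction(J)


if __name__ == "__main__":
    for J in (Fraction(1, 2), 1, Fraction(3, 2), 2, Fraction(1, 3)):
        print(J, [str(m) for m in projections(J)], ladder_closes(J))
```

Only integral and half-integral $J$ pass the check; a value such as 1/3 (chosen here purely as a counterexample) does not.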

Knowing which matrix elements of $J_x \pm iJ_y$ are nonzero, let us
now develop the matrix elements of $(J_x - iJ_y)(J_x + iJ_y)$. We
begin by expanding the operator products:

$$(J_x - iJ_y)(J_x + iJ_y) = J_x^2 + J_y^2 + i(J_x J_y - J_y J_x)$$

$$= J_x^2 + J_y^2 - J_z = J^2 - J_z^2 - J_z \qquad (30.19)$$

Since we have a diagonal matrix on the right, the matrix on the
left is also diagonal. We write the diagonal element corresponding
to this matrix,

$$(J_x - iJ_y)_{J_z,\, J_z+1}\, (J_x + iJ_y)_{J_z+1,\, J_z} = J^2 - J_z^2 - J_z \qquad (30.20)$$

and apply Eq. (30.20) to the case when $J_z$ is equal to its maximum
value $J$. Then in the left-hand side both matrix elements involved
in the product contain the index $J + 1$. Hence, they are both equal
to zero, and we obtain the equation defining the eigenvalue of the
angular momentum square:

$$J^2 = J(J + 1) \qquad (30.21)$$

We should not be misled by the symbolic form of this equation,
for we agreed to treat $J^2$ as the eigenvalue of the operator $\hat{J}^2$, and $J$
as the maximum projection of the angular momentum on the $z$ axis.

Unlike orbital angular momentum, total angular momentum can
have both integral and half-integral values of $J$. Both are compatible
with the commutation relations (30.8).⁷

Then it turns out that the angular momentum of a given system
can assume either only integral or only half-integral values, because
$J_z$ varies each time only by unity.

Let us now find the matrix elements of the angular momentum
components. For this we note that

$$(J_x - iJ_y)_{J_z,\, J_z+1} = (J_x + iJ_y)^*_{J_z+1,\, J_z}$$

because $J_x$ and $J_y$ are Hermitian operators.

With the help of (30.20), this yields

$$|(J_x + iJ_y)_{J_z+1,\, J_z}|^2 = J(J + 1) - J_z(J_z + 1) = (J - J_z)(J + J_z + 1) \qquad (30.22a)$$

This matrix element becomes zero at $J_z = -(J + 1)$, as it should.
Similarly

$$|(J_x - iJ_y)_{J_z-1,\, J_z}|^2 = J(J + 1) - J_z(J_z - 1) = (J + J_z)(J - J_z + 1) \qquad (30.22b)$$

Here we obtain zero when $J_z = J + 1$.

⁷ Let us trace in greater detail how half-integral eigenvalues are eliminated
in the case of orbital angular momentum. Let $k = l$, which corresponds to
$J_z = J$. To operator $J_x + iJ_y$ there corresponds
$e^{i\varphi}\left(\frac{\partial}{\partial\vartheta} + i\cot\vartheta\, \frac{\partial}{\partial\varphi}\right) = L_+$
(see Exercise 5, Section 24). Operating on $Y_l^l = P_l^l(\cos\vartheta)\, e^{il\varphi}$ we find that
this operator should yield zero, as otherwise it would have a matrix element
with indices $l$, $l + 1$, which must be equal to zero. Hence
$\left(\frac{d}{d\vartheta} - l\cot\vartheta\right) P_l^l = 0$
and $P_l^l = C(\sin\vartheta)^l$. If, for example, $l = 1/2$, then $Y_{1/2}^{1/2} = (\sin\vartheta)^{1/2} e^{i\varphi/2}$.
Operating on it, the operator
$L_- = e^{-i\varphi}\left(-\frac{\partial}{\partial\vartheta} + i\cot\vartheta\, \frac{\partial}{\partial\varphi}\right)$
should yield the function $Y_{1/2}^{-1/2}$, since at $l = 1/2$ it has only this matrix
element not equal to zero. But $L_- Y_{1/2}^{1/2} \neq Y_{1/2}^{-1/2}$, so that the value $l = 1/2$ is
precluded for the orbital angular momentum.

If we define $(J_x + iJ_y)$ as a matrix with real elements, then the
matrix $(J_x - iJ_y)$ must also have real elements: on this condition
the commutation relations between the two matrices will correspond
to (30.8). Therefore

$$(J_x + iJ_y)_{J_z+1,\, J_z} = [(J - J_z)(J + J_z + 1)]^{1/2} \qquad (30.23a)$$

$$(J_x - iJ_y)_{J_z-1,\, J_z} = [(J + J_z)(J - J_z + 1)]^{1/2} \qquad (30.23b)$$
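Formulas (30.23a) and (30.23b) can be checked numerically for several values of $J$. The following is a minimal sketch in plain Python, assuming units of $h$ and rows and columns ordered as $J_z = J, J - 1, \ldots, -J$; the function names are illustrative only. It builds $J_x$, $J_y$, $J_z$ from the real matrix of $J_x + iJ_y$ and measures how far they deviate from $J_x J_y - J_y J_x = iJ_z$ and from $J^2 = J(J + 1)$.

```python
from fractions import Fraction
from math import sqrt


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def angular_momentum_matrices(J):
    """Return (Jx, Jy, Jz) in units of h, ordered J_z = J, J-1, ..., -J."""
    J = Fraction(J)
    n = int(2 * J) + 1
    m = [J - k for k in range(n)]
    plus = [[0.0] * n for _ in range(n)]
    for col in range(1, n):
        jz = m[col]  # column label J_z; the row label is J_z + 1
        plus[col - 1][col] = sqrt((J - jz) * (J + jz + 1))  # (30.23a)
    minus = [[plus[c][r] for c in range(n)] for r in range(n)]  # real transpose
    jx = [[(plus[r][c] + minus[r][c]) / 2 for c in range(n)] for r in range(n)]
    jy = [[(plus[r][c] - minus[r][c]) / 2j for c in range(n)] for r in range(n)]
    jz_matrix = [[float(m[r]) if r == c else 0.0 for c in range(n)]
                 for r in range(n)]
    return jx, jy, jz_matrix


def check(J):
    """Largest deviations from [Jx, Jy] = i Jz and from J^2 = J(J+1)."""
    jx, jy, jz = angular_momentum_matrices(J)
    n = len(jz)
    xy, yx = matmul(jx, jy), matmul(jy, jx)
    squares = [matmul(a, a) for a in (jx, jy, jz)]
    target = float(Fraction(J) * (Fraction(J) + 1))
    comm_err = max(abs(xy[r][c] - yx[r][c] - 1j * jz[r][c])
                   for r in range(n) for c in range(n))
    square_err = max(abs(sum(s[r][c] for s in squares) - (target if r == c else 0.0))
                     for r in range(n) for c in range(n))
    return comm_err, square_err


if __name__ == "__main__":
    for J in (Fraction(1, 2), 1, Fraction(3, 2)):
        print(J, check(J))
```

Both deviations stay at the level of rounding error for integral as well as half-integral $J$.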

Matrix $(J_x + iJ_y)$ has elements other than zero only next to the
principal diagonal, to the right of it, while matrix $(J_x - iJ_y)$
has them only to the left of the principal diagonal, and also next to
it. We shall require these matrices for $J = 1/2$. For this value of
the total angular momentum its projection $J_z$ takes only the values
$1/2$ and $-1/2$. Accordingly, we obtain the following matrix elements:

$$(J_x + iJ_y)_{1/2,\, -1/2} = \left[\left(\tfrac{1}{2} + \tfrac{1}{2}\right)\left(\tfrac{1}{2} - \tfrac{1}{2} + 1\right)\right]^{1/2} = 1 \qquad (30.24a)$$

$$(J_x - iJ_y)_{-1/2,\, 1/2} = \left[\left(\tfrac{1}{2} + \tfrac{1}{2}\right)\left(\tfrac{1}{2} - \tfrac{1}{2} + 1\right)\right]^{1/2} = 1 \qquad (30.24b)$$

Each matrix has two rows and two columns.

From these matrices it is not difficult to find the matrices of the
projections $J_x$ and $J_y$. They are

$$J_x = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad (30.25a)$$

$$J_y = \frac{1}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad (30.25b)$$

Multiplying $J_y$ by $\pm i$ and adding with $J_x$, we return to (30.24a)
and (30.24b), which justifies Eqs. (30.25a) and (30.25b).

We must supplement the matrices of $J_x$ and $J_y$ with that of $J_z$.
Since it is by definition diagonal, it should be written in the form

$$J_z = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad (30.25c)$$
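As a quick consistency check, the three matrices (30.25a)-(30.25c) can be substituted directly into the commutation relations (30.8). The sketch below is an illustration in plain Python, assuming units of $h$; the helper names are not from the text. It also recovers the single nonzero element of $J_x + iJ_y$ stated in (30.24a).

```python
JX = [[0, 0.5], [0.5, 0]]
JY = [[0, -0.5j], [0.5j, 0]]
JZ = [[0.5, 0], [0, -0.5]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine(a, b, cb=1):
    """Return the matrix a + cb * b."""
    return [[a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]


def commutator(a, b):
    return combine(matmul(a, b), matmul(b, a), -1)


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


def times_i(a):
    return [[1j * x for x in row] for row in a]


if __name__ == "__main__":
    # Cyclic commutation relations (30.8) in units of h.
    print(close(commutator(JX, JY), times_i(JZ)))
    print(close(commutator(JY, JZ), times_i(JX)))
    print(close(commutator(JZ, JX), times_i(JY)))
    # J_x + i J_y has its only nonzero element in row 1/2, column -1/2.
    print(combine(JX, JY, 1j))
    # J^2 = J(J + 1) = 3/4 times the unit matrix.
    j2 = combine(combine(matmul(JX, JX), matmul(JY, JY)), matmul(JZ, JZ))
    print(j2)
```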

The Spin Variable of an Electron. We must now develop the state
vector, or the wave function, on which the operators (30.25a)-(30.25c)
will operate. It is apparent that these wave functions depend not on
the spatial coordinates but on a special variable describing the spin
degree of freedom. But since the matrices corresponding to angular
momentum $J = 1/2$ have only two rows and two columns, the
corresponding state vector has only two components. Let us call
them $\psi_1$ and $\psi_2$.

Let us find the eigenfunctions corresponding to the projections of
spin $\pm 1/2$. For this it is convenient to write them in a column:

$$\psi = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} \qquad (30.26)$$

These functions, taken together and written in Eq. (30.26) in the
form of a single function $\psi$, must be eigenfunctions of the operator $J_z$,
defined as the matrix (30.25c):

$$\frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = J_z \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} \qquad (30.27)$$

Since $J_z$ is a diagonal operator, the form of the eigenfunctions is
easily determined:

$$J_z = \tfrac{1}{2},\ \ \psi_{1/2} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}; \qquad J_z = -\tfrac{1}{2},\ \ \psi_{-1/2} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \qquad (30.28)$$

that is

$$\psi_{1/2,\,1} = 1, \quad \psi_{1/2,\,2} = 0, \quad \psi_{-1/2,\,1} = 0, \quad \psi_{-1/2,\,2} = 1$$

We thus arrive at a notation for the eigenfunctions of $J_z$ which
is quite similar to the general notation for the eigenfunctions of any
operator, for example the operator $\lambda$ in terms of $x$: $\psi(\lambda, x)$.

In place of the independent variable $x$ we have the indices 1 and 2, and
in place of $\lambda$ the spin operator eigenvalues, $1/2$ and $-1/2$. As distinct from
the variable $x$, which assumes a continuous set of values, the spin
variable has only two values (1 and 2), but nevertheless all the
formal demands imposed on operators and eigenvalues in quantum
mechanics are satisfied in the case of spin. For example, it can be
seen from the notation of the operators $J_x$, $J_y$, and $J_z$ that they are
Hermitian. The wave functions corresponding to different eigenvalues
of the operator $J_z$ are orthogonal. This can be seen directly
from (30.28): if we denote the spin variable $s$ ($s = 1, 2$) and the
eigenvalues $J_z$ ($J_z = 1/2, -1/2$), the orthonormality relations

$$\sum_s \psi^*_{J_z' s}\, \psi_{J_z s} = \delta_{J_z J_z'} \qquad (30.29)$$

are satisfied.

Thus, the eigenvalue of the spin projection should be seen as a
fourth quantum number in addition to $n$, $l$, and $k$. In order to avoid
writing the fraction 1/2 every time, usually only the sign of the
spin projection, denoted by the letter $\sigma$, is stated. In normalizing
an arbitrary wave function, the square of its modulus is integrated
over the spatial variables and summed over the spin variable $s$.
For example, the orthonormality condition in a central field should be
written as

$$\sum_s \int \psi^*_{n'l'k'\sigma'}\, \psi_{nlk\sigma}\, dV = \delta_{nn'}\, \delta_{ll'}\, \delta_{kk'}\, \delta_{\sigma\sigma'} \qquad (30.30)$$


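Pauli Spin Matrices. Having agreed not to write the magnitude
of the spin projection, only its sign, it is convenient to introduce in
place of $J_x$, $J_y$, and $J_z$ three such matrices:

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad (30.31)$$

which were introduced by Pauli for the quantum mechanical description
of electron spin. Let us examine the properties of these matrices
more closely.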

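Note, first of all, that each of them has one nonzero element in
each row and in each column. This makes it possible to represent
the operation of such matrices on the function (30.26) in the form
of a substitution:

$$\sigma_x \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \begin{pmatrix} \psi_2 \\ \psi_1 \end{pmatrix}, \quad \sigma_y \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \begin{pmatrix} -i\psi_2 \\ i\psi_1 \end{pmatrix}, \quad \sigma_z \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \begin{pmatrix} \psi_1 \\ -\psi_2 \end{pmatrix} \qquad (30.32)$$

It is easier to find the rules for multiplying the matrices with the
help of substitutions than in the general way; for example

$$\sigma_x^2 \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \sigma_x \begin{pmatrix} \psi_2 \\ \psi_1 \end{pmatrix} = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} \qquad (30.33a)$$

Obviously, this equation should be symbolically written as
$\sigma_x^2 = 1$. The unity here denotes $\delta_{ss'}$. In the same way we find that

$$\sigma_y^2 \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \sigma_y \begin{pmatrix} -i\psi_2 \\ i\psi_1 \end{pmatrix} = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}, \quad \sigma_y^2 = 1 \qquad (30.33b)$$

and finally

$$\sigma_z^2 \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \sigma_z \begin{pmatrix} \psi_1 \\ -\psi_2 \end{pmatrix} = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}, \quad \sigma_z^2 = 1 \qquad (30.33c)$$

Now, in the same way we find the paired products of the Pauli
matrices:

$$\sigma_x \sigma_y = -\sigma_y \sigma_x = i\sigma_z \qquad (30.34a)$$

$$\sigma_y \sigma_z = -\sigma_z \sigma_y = i\sigma_x \qquad (30.34b)$$

$$\sigma_z \sigma_x = -\sigma_x \sigma_z = i\sigma_y \qquad (30.34c)$$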

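All the obtained equations can be written as one equation in tensor
form (see Sec. 11):

$$\sigma_l \sigma_m = \delta_{lm} + i\varepsilon_{lmn}\, \sigma_n \qquad (30.35)$$

This means that any expression that is quadratic with respect
to the Pauli matrices can be reduced to a linear expression. For
example

$$(\mathbf{A} \cdot \boldsymbol{\sigma})(\mathbf{B} \cdot \boldsymbol{\sigma}) = A_l \sigma_l\, B_m \sigma_m = A_l B_l + i\varepsilon_{lmn} A_l B_m \sigma_n$$

$$= (\mathbf{A} \cdot \mathbf{B}) + i(\mathbf{A} \times \mathbf{B}) \cdot \boldsymbol{\sigma} \qquad (30.36)$$

Of course, here both Pauli matrices operate on the spin variables
of the same electron.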
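Relations (30.35) and (30.36) lend themselves to a direct numerical check. The following is a minimal sketch in plain Python; the helper names and the sample vectors are illustrative assumptions, not part of the text. Indices 0, 1, 2 stand for $x$, $y$, $z$.

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
IDENTITY = [[1, 0], [0, 1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def levi_civita(l, m, n):
    """Antisymmetric unit tensor for indices taken from 0, 1, 2."""
    return (l - m) * (m - n) * (n - l) // 2


def product_rule_error():
    """Largest deviation from sigma_l sigma_m = delta_lm + i eps_lmn sigma_n."""
    worst = 0.0
    for l in range(3):
        for m in range(3):
            lhs = matmul(SIGMA[l], SIGMA[m])
            for r in range(2):
                for c in range(2):
                    rhs = (l == m) * IDENTITY[r][c] + sum(
                        1j * levi_civita(l, m, n) * SIGMA[n][r][c]
                        for n in range(3))
                    worst = max(worst, abs(lhs[r][c] - rhs))
    return worst


def dot_sigma(v):
    return [[sum(v[n] * SIGMA[n][r][c] for n in range(3)) for c in range(2)]
            for r in range(2)]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def identity_error(a, b):
    """Largest deviation from (A.sigma)(B.sigma) = (A.B) + i (A x B).sigma."""
    lhs = matmul(dot_sigma(a), dot_sigma(b))
    ab = sum(x * y for x, y in zip(a, b))
    axb = dot_sigma(cross(a, b))
    return max(abs(lhs[r][c] - (ab * IDENTITY[r][c] + 1j * axb[r][c]))
               for r in range(2) for c in range(2))


if __name__ == "__main__":
    print(product_rule_error())
    print(identity_error((1.0, 2.0, 3.0), (-2.0, 0.5, 4.0)))
```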

The Vector Properties of Pauli Matrices. We shall show that the
Pauli matrices can be treated as vector components. For this we
must verify that in rotations of the coordinate system they transform
like vector components, that is, that the transformed operators
possess the same properties as the initial ones.

In Section 9 we introduced symbols for the cosines of the angles
between the old and new coordinate axes, namely, the cosine between
the old axis $\alpha$ and the new axis $\alpha'$ was denoted $(\alpha', \alpha)$. Then the
components of the transformed vector are expressed in terms of the
vector components with respect to the old axes in the following way:

$$\sigma_{\alpha'} = (\alpha', \alpha)\, \sigma_\alpha, \quad \sigma_{\beta'} = (\beta', \beta)\, \sigma_\beta \qquad (30.37)$$

We must prove that their product yields a result similar to (30.35).
Multiplying $\sigma_{\alpha'}$ by $\sigma_{\beta'}$, we obtain

$$\sigma_{\alpha'} \sigma_{\beta'} = (\alpha', \alpha)(\beta', \beta)\, \sigma_\alpha \sigma_\beta$$

$$= (\alpha', \alpha)(\beta', \beta)(\delta_{\alpha\beta} + i\varepsilon_{\alpha\beta\gamma}\, \sigma_\gamma)$$

$$= \delta_{\alpha'\beta'} + i(\alpha', \alpha)(\beta', \beta)\, \varepsilon_{\alpha\beta\gamma}\, \sigma_\gamma$$

But $\varepsilon_{\alpha\beta\gamma}$ is an invariant tensor and is the same in any coordinate
system. The same can be said if one of the tensor indices ($\gamma$) is
contracted (summed over) with the index of some vector, for example, $\sigma_\gamma$.
This follows simply from the properties of the transformation
coefficients $(\alpha', \alpha)$, ..., $(\gamma', \gamma)$. Therefore

$$(\alpha', \alpha)(\beta', \beta)\, \varepsilon_{\alpha\beta\gamma}\, \sigma_\gamma = \varepsilon_{\alpha'\beta'\gamma'}\, \sigma_{\gamma'}$$

Thus

$$\sigma_{\alpha'} \sigma_{\beta'} = \delta_{\alpha'\beta'} + i\varepsilon_{\alpha'\beta'\gamma'}\, \sigma_{\gamma'}$$

the same as for the old components $\sigma_\alpha$, $\sigma_\beta$.

The transformation formulas for a rotation around one axis are
conveniently written in explicit form:

$$\sigma_x' = \sigma_x \cos\varphi + \sigma_y \sin\varphi$$

$$\sigma_y' = -\sigma_x \sin\varphi + \sigma_y \cos\varphi \qquad (30.38)$$

We shall use these formulas to find the transformation law for the
wave function components $\psi_s$.

It was shown in Section 26 that the transformation of a rotation of 
coordinate axes belongs to the class of unitary transformations. 
Equations (30.37) and (30.38) express unitary transformations of 
operators. But it is of interest to obtain the corresponding unitary 
transformation of the wave functions themselves. 

We begin with the simplest transformation: a rotation around the
$z$ axis, which for operators has the form (30.38). We shall proceed
from the standard wave functions (30.28), which we shall denote
simply $\psi_+$ and $\psi_-$ so as to avoid writing fractions in the index.
An arbitrary function of the spin variable can be represented as
a linear combination of $\psi_+$ and $\psi_-$ thus:

$$\psi = x_1 \psi_+ + x_2 \psi_- \qquad (30.39)$$

Function $\psi$ is, of course, also a two-component one, its first
component, as can be seen from (30.28) and (30.39), being equal to $x_1$,
and its second to $x_2$.

We now carry out a rotation of the coordinate system around the
$z$ axis through an infinitesimal angle $\delta\varphi$. Then, as we know from
the definition of angular momentum (30.2) or (30.3),

$$\delta\psi_+ = \frac{i}{2}\, \delta\varphi\, \psi_+, \quad \delta\psi_- = -\frac{i}{2}\, \delta\varphi\, \psi_- \qquad (30.40)$$

Hence, for $\delta\psi$ we have

$$\delta\psi = x_1\, \delta\psi_+ + x_2\, \delta\psi_- = \frac{i}{2}\, \delta\varphi\, (x_1 \psi_+ - x_2 \psi_-) \qquad (30.41)$$

Integrating this relation, we obtain the transformation formula
for $\psi$ for a finite rotation angle $\varphi$:

$$\psi' = \begin{pmatrix} x_1' \\ x_2' \end{pmatrix} = \begin{pmatrix} x_1 e^{i\varphi/2} \\ x_2 e^{-i\varphi/2} \end{pmatrix} \qquad (30.42)$$

Thus, the two-component function $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ transforms through
one-half the rotation angle. Note that $|\psi'|^2 = |x_1'|^2 + |x_2'|^2 =
|\psi|^2 = |x_1|^2 + |x_2|^2$, as should be in a unitary transformation
retaining the normalization of the wave function.
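The half-angle law (30.42) has two immediate consequences that are easy to see numerically: the normalization is preserved, and a rotation through $2\pi$ reverses the sign of both components. The sketch below illustrates this in plain Python; the function names and the sample components are assumptions made for the example.

```python
import cmath
import math


def rotate_about_z(spinor, phi):
    """Apply (30.42): x1 -> x1 e^{i phi/2}, x2 -> x2 e^{-i phi/2}."""
    x1, x2 = spinor
    return (x1 * cmath.exp(1j * phi / 2), x2 * cmath.exp(-1j * phi / 2))


def norm_squared(spinor):
    return sum(abs(component) ** 2 for component in spinor)


if __name__ == "__main__":
    spinor = (0.6, 0.8j)  # arbitrary normalized sample components
    print(norm_squared(rotate_about_z(spinor, 1.234)))
    print(rotate_about_z(spinor, 2 * math.pi))  # sign reversed
    print(rotate_about_z(spinor, 4 * math.pi))  # initial value restored
```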

Let us find the more general form of the unitary transformation
of the wave function $\psi$, which we represent with the help of four
numbers, $\alpha$, $\beta$, $\gamma$, and $\delta$:

$$x_1' = \alpha x_1 + \beta x_2$$

$$x_2' = \gamma x_1 + \delta x_2 \qquad (30.43)$$

The complex conjugate quantities $x_1'^*$, $x_2'^*$ can be expressed
respectively in terms of $\alpha^*$, $\beta^*$, $\gamma^*$, and $\delta^*$. Let us require that
$|\psi'|^2 = |\psi|^2$. On the transformation coefficients this imposes the
following four conditions, which appear from a comparison of the
expressions multiplying $|x_1|^2$, $|x_2|^2$, $x_1 x_2^*$, and $x_1^* x_2$:

$$\alpha^*\alpha + \gamma^*\gamma = 1, \quad \alpha^*\beta + \gamma^*\delta = 0$$

$$\beta^*\beta + \delta^*\delta = 1, \quad \alpha\beta^* + \gamma\delta^* = 0 \qquad (30.44)$$

From the conditions (30.44) there follows the relation between $\alpha$,
$\beta$, $\gamma$, $\delta$ and the complex conjugate quantities $\alpha^*$, $\beta^*$, $\gamma^*$, $\delta^*$:

$$\begin{pmatrix} \alpha^* & \gamma^* \\ \beta^* & \delta^* \end{pmatrix} = \frac{1}{D}\begin{pmatrix} \delta & -\beta \\ -\gamma & \alpha \end{pmatrix} \qquad (30.45)$$

where $D = \alpha\delta - \beta\gamma$, that is, the determinant of the transformation.

Equation (30.45) denotes that the complex conjugate transformation
matrix is equal to the inverse matrix, but this is precisely the
condition for unitarity. Forming the determinants of both sides
of (30.45), we find that $D^* = D^{-1}$. We can put $D = 1$ without
restricting the generality; then

$$\alpha = \delta^*, \quad \beta = -\gamma^* \qquad (30.46)$$



Let us apply the obtained relationships to a rotation of a coordinate
system through an angle $\vartheta$ about the $x$ axis. Suppose the projection
of the spin on the initial $z$ axis was $+1/2$. The mean value of
the spin projection on the new axis is equal to $\cos\vartheta$, since the
relations between mean values in quantum mechanics are the same as
between the values themselves in classical mechanics. Since the
basic properties of operators are conserved in a rotation of the
coordinate axes, we assume that the operator of the spin projection on the
new axis $z$ ($\sigma_z'$) has the form (30.31). Let us find the expressions for
the wave functions transformed in the rotation.

If the wave function of the state was $\psi = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ with respect to
the old coordinate system, with respect to the new axes we should
have, from (30.43),

$$\psi_1' = \alpha, \quad \psi_2' = \gamma, \quad \psi' = \begin{pmatrix} \alpha \\ \gamma \end{pmatrix}$$

and for the complex conjugate we obtain, with the help of (30.46),

$$\psi_1'^* = \alpha^* = \delta, \quad \psi_2'^* = \gamma^* = -\beta$$

Both these expressions should be substituted into the definition
of the mean value of the angular momentum projection on the new
$z$ axis:

$$\langle \sigma_z' \rangle = \sum_s \psi_s'^*\, \sigma_z\, \psi_s' = \cos\vartheta \qquad (30.47)$$

s 

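Using the known form of the operator $\sigma_z$ and the expressions for $\psi'^*$
and $\psi'$, we determine from (30.47)

$$\alpha\delta + \beta\gamma = \cos\vartheta$$

Since $\alpha\delta - \beta\gamma = D = 1$, we obtain the second equation relating
the transformation coefficients: $\alpha\delta - \beta\gamma = 1$. Adding and
subtracting these equations, we find

$$\alpha\delta = \tfrac{1}{2}(1 + \cos\vartheta) = \cos^2\frac{\vartheta}{2}, \quad \beta\gamma = -\sin^2\frac{\vartheta}{2}$$

Applying (30.46), we obtain

$$\alpha^*\alpha = |\alpha|^2 = \cos^2\frac{\vartheta}{2}, \quad \beta^*\beta = |\beta|^2 = \sin^2\frac{\vartheta}{2}$$

We assume that $\alpha$ is a real quantity equal to $\cos(\vartheta/2)$. Then,
to satisfy the condition $\alpha\delta - \beta\gamma = 1$, we must consider $\beta$ to be
a purely imaginary quantity $i\sin(\vartheta/2)$. The correctness of this
choice of $\alpha$ and $\beta$ is seen from the following. We find $\langle \sigma_y \rangle$, which
evidently must be equal to $\sin\vartheta$. On the other hand we have

$$\langle \sigma_y \rangle = \sum_s \psi_s'^*\, \sigma_y\, \psi_s' = -i\psi_1'^* \psi_2' + i\psi_2'^* \psi_1' = 2\sin\frac{\vartheta}{2}\cos\frac{\vartheta}{2} = \sin\vartheta \qquad (30.48)$$

We have thus obtained a matrix describing the rotation about the
$x$ axis:

$$\begin{pmatrix} \cos\frac{\vartheta}{2} & i\sin\frac{\vartheta}{2} \\ i\sin\frac{\vartheta}{2} & \cos\frac{\vartheta}{2} \end{pmatrix}$$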

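The most general form of the matrix is obtained by performing
one more rotation about the new axis $z$ through an angle $\chi$. The
three rotation angles $\varphi$, $\vartheta$, and $\chi$ are essentially the Euler angles
(Sec. 9). Multiplying all three matrices obtained for each rotation
separately, we obtain the most general rotation matrix:

$$\begin{pmatrix} e^{i\chi/2} & 0 \\ 0 & e^{-i\chi/2} \end{pmatrix} \begin{pmatrix} \cos\frac{\vartheta}{2} & i\sin\frac{\vartheta}{2} \\ i\sin\frac{\vartheta}{2} & \cos\frac{\vartheta}{2} \end{pmatrix} \begin{pmatrix} e^{i\varphi/2} & 0 \\ 0 & e^{-i\varphi/2} \end{pmatrix}$$

$$= \begin{pmatrix} \cos\frac{\vartheta}{2}\, e^{i(\varphi+\chi)/2} & i\sin\frac{\vartheta}{2}\, e^{i(\chi-\varphi)/2} \\ i\sin\frac{\vartheta}{2}\, e^{i(\varphi-\chi)/2} & \cos\frac{\vartheta}{2}\, e^{-i(\varphi+\chi)/2} \end{pmatrix} \qquad (30.49)$$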
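The product in (30.49) can be multiplied out numerically and compared with the closed form. The sketch below, in plain Python with illustrative function names and arbitrarily chosen sample angles, also checks that the resulting matrix is unitary.

```python
import cmath
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def z_rotation_matrix(angle):
    return [[cmath.exp(1j * angle / 2), 0], [0, cmath.exp(-1j * angle / 2)]]


def x_rotation_matrix(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, 1j * s], [1j * s, c]]


def general_rotation(phi, theta, chi):
    """Product of the three matrices on the left-hand side of (30.49)."""
    return matmul(z_rotation_matrix(chi),
                  matmul(x_rotation_matrix(theta), z_rotation_matrix(phi)))


def closed_form(phi, theta, chi):
    """Right-hand side of (30.49)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [
        [c * cmath.exp(1j * (phi + chi) / 2), 1j * s * cmath.exp(1j * (chi - phi) / 2)],
        [1j * s * cmath.exp(1j * (phi - chi) / 2), c * cmath.exp(-1j * (phi + chi) / 2)],
    ]


def max_difference(a, b):
    return max(abs(a[r][c] - b[r][c]) for r in range(2) for c in range(2))


def unitarity_error(u):
    """Largest deviation of (U^+ U) from the unit matrix."""
    dagger = [[u[c][r].conjugate() for c in range(2)] for r in range(2)]
    return max_difference(matmul(dagger, u), [[1, 0], [0, 1]])


if __name__ == "__main__":
    angles = (0.3, 1.1, -0.8)  # sample values of phi, theta, chi
    print(max_difference(general_rotation(*angles), closed_form(*angles)))
    print(unitarity_error(general_rotation(*angles)))
```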


Such a matrix has a certain analogy with the matrix of cosines
$(\lambda, \mu)$. But unlike the latter, it operates not on the components of
a vector but on the components of a special quantity, called a spinor.
A spinor is a two-component complex quantity with a transformation
law described by matrix (30.49), whereas a vector is a three-component
quantity which transforms in rotations of the coordinates
with the help of a matrix of cosines.

Since the spinor transformation matrix is expressed in terms of
half-angles, the spinor does not revert to its initial value in a
rotation of the coordinate axes through 360°. It is clear from this that no
classical directly measurable quantity can be expressed linearly in
terms of a spinor. Only bilinear expressions of the type (30.47) and
(30.48) are possible. Such quantities, naturally, assume their initial
values in a rotation through 360°. But only they can be measured in
quantum mechanics, and by no means the wave functions themselves.


The Operator of the Total Angular Momentum of an Electron. We
examined the spin of an electron apart from its orbital angular
momentum. Let us now form the vector operator of the total angular
momentum of an electron:

$$\mathbf{j} = \mathbf{M} + \frac{1}{2}\boldsymbol{\sigma} \qquad (30.50a)$$

or in terms of the components

$$j_x = M_x + \frac{1}{2}\sigma_x, \quad j_y = M_y + \frac{1}{2}\sigma_y, \quad j_z = M_z + \frac{1}{2}\sigma_z \qquad (30.50b)$$

Operators $\mathbf{M}$ and $\boldsymbol{\sigma}$ commute, since they operate on different
variables: $\mathbf{M}$ on a spatial variable, and $\boldsymbol{\sigma}$ on a spin variable. Since
$M_i$ and $\frac{1}{2}\sigma_i$ have the same commutation relations, the components
of the vector operator $\mathbf{j}$ have the same commutation relations.
Furthermore, $\mathbf{j}$ is a vector, because $\mathbf{M}$ and $\boldsymbol{\sigma}$ transform in the same
way in the rotations of the coordinate system.

If the eigenvalue of the square of the orbital angular momentum is
not zero, that is, if $l \neq 0$, the eigenvalues of the square of the total
angular momentum may be equal to either $(l + 1/2)(l + 3/2)$ or
$(l - 1/2)(l + 1/2)$. In the former case the spin and the orbital
angular momentum are said to be parallel, in the latter case they
are antiparallel.

This leads to a doubling of the number of electron states. Instead
of the quantum number $\sigma$ we can use the quantum number
$j = l \pm 1/2$.
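The bookkeeping of the last two paragraphs can be spelled out for particular values of $l$. In the sketch below (plain Python, illustrative function names), the case $l = 0$, for which only $j = 1/2$ remains, is added as a standard fact rather than taken from the text. The total number of states with given $l$ comes out as $2(2l + 1)$, that is, twice the number without spin.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def total_j_values(l):
    """Possible j for orbital quantum number l (only j = 1/2 when l = 0)."""
    return [l + HALF] if l == 0 else [l - HALF, l + HALF]


def j_squared(j):
    """Eigenvalue of the square of the total angular momentum, in units of h^2."""
    return j * (j + 1)


def count_states(l):
    """Number of states: each j contributes 2j + 1 projections."""
    return sum(int(2 * j + 1) for j in total_j_values(l))


if __name__ == "__main__":
    for l in range(4):
        values = total_j_values(l)
        print(l, [str(j) for j in values],
              [str(j_squared(j)) for j in values], count_states(l))
```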


Spin Magnetic Moment. The spin of an electron, like its orbital
angular momentum, is associated with a definite magnetic moment.
But experiment shows that the ratio of spin magnetic moment to
spin angular momentum is twice as great as for the electron’s orbital
motion. There is nothing paradoxical in this because the result
(17.30) can be applied only to orbital angular momentum. At the
same time we can deduce the spin magnetic moment from the Dirac
relativistic wave equation for an electron (Sec. 37); in agreement with
experiment, the relation between the spin magnetic moment and the
spin angular momentum is

$$\boldsymbol{\mu}_\sigma = \frac{eh}{2mc}\, \boldsymbol{\sigma} \qquad (30.51)$$

Hence, the projection of spin magnetic moment on any axis is

$$(\mu_\sigma)_z = \pm\frac{eh}{2mc} \qquad (30.52)$$

(Here we again measure the angular momentum in conventional
units.) The quantity $eh/2mc$ is called the Bohr magneton. It is a natural
unit of magnetic moment.

Since the spin and orbital moments produce their own magnetic
moments, magnetic interaction occurs between them. It is
proportional to the product of the magnetic moments. But the formula
of each magnetic moment involves the velocity of light in the
denominator. Hence, the spin-orbit coupling, as it is called, is inversely
proportional to $c^2$, that is, it is a relativistic effect. If the velocities
of atomic electrons are small in comparison with $c$, their magnetic
interaction is small. This is always the case for light atoms.

Due to magnetic forces, the energy level with $j = l + 1/2$ always
differs slightly from the energy level with $j = l - 1/2$. In such
simple form this conclusion refers to a separate electron in a central 
field, for example, in an atom of an alkali metal. 


EXERCISES 

1. Show that the common general property of the Pauli spin matrices,
$\sigma_\alpha \sigma_\beta + \sigma_\beta \sigma_\alpha = 2\delta_{\alpha\beta}$, is conserved in rotations of the coordinate system,
making use of the fact that
$(\alpha, \nu)(\beta, \nu) = \delta_{\alpha\beta}$

2. Determine the eigenvalues of the scalar product $(\boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2)$ for two
electrons, making use of the fact that $\boldsymbol{\sigma}_1$ and $\boldsymbol{\sigma}_2$ commute.

Solution. Since $\boldsymbol{\sigma}_1$ and $\boldsymbol{\sigma}_2$ commute, the conventional formula

$$(\boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2)^2 = \boldsymbol{\sigma}_1^2 + \boldsymbol{\sigma}_2^2 + 2(\boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2)$$

holds true.

We know from Section 29 that the sum of two angular momenta varies
from their sum to their difference, so that the total spin assumes values 1
and 0 (according to the largest projection). Since $\boldsymbol{\sigma}$ is the double operator
of spin, the respective maximum projections are all twice as large. Hence,
when the spins are added like parallel vectors, the eigenvalue of the square,
$(\boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2)^2$, is fourfold the corresponding square of the angular momentum,
that is, it is equal to $4 \times 1 \times 2 = 8$; when the spins are antiparallel it is
zero. Furthermore, $\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = 1$, so that $\boldsymbol{\sigma}_1^2 = \boldsymbol{\sigma}_2^2 = 3$ and the
eigenvalue of $(\boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2)$ is $(8 - 6)/2 = 1$ for parallel spins, and
$(0 - 6)/2 = -3$ for antiparallel spins.
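The result of Exercise 2 can be confirmed by writing $(\boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2)$ as a $4 \times 4$ matrix acting on the spin variables of both electrons. The following is a minimal sketch in plain Python; the basis ordering (up-up, up-down, down-up, down-down) and the helper names are assumptions of the example.

```python
import math

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def kron(a, b):
    """Direct product of two 2x2 matrices; row index 2i + k, column index 2j + l."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def sigma_dot_sigma():
    total = [[0j] * 4 for _ in range(4)]
    for s in SIGMA:
        term = kron(s, s)
        total = [[total[r][c] + term[r][c] for c in range(4)] for r in range(4)]
    return total


def eigenvalue_on(state):
    """Return (mean value, residual norm) of sigma1.sigma2 on a normalized state."""
    op = sigma_dot_sigma()
    image = [sum(op[r][c] * state[c] for c in range(4)) for r in range(4)]
    value = sum(state[r].conjugate() * image[r] for r in range(4)).real
    residual = math.sqrt(sum(abs(image[r] - value * state[r]) ** 2 for r in range(4)))
    return value, residual


R = 1 / math.sqrt(2)
PARALLEL = [
    [1 + 0j, 0j, 0j, 0j],
    [0j, R + 0j, R + 0j, 0j],
    [0j, 0j, 0j, 1 + 0j],
]
ANTIPARALLEL = [0j, R + 0j, -R + 0j, 0j]

if __name__ == "__main__":
    for state in PARALLEL:
        print(eigenvalue_on(state))
    print(eigenvalue_on(ANTIPARALLEL))
```

A vanishing residual shows that each of the four states is indeed an eigenstate.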

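3. Determine the eigenfunctions of the operators $\sigma_x$ and $\sigma_y$.

Answer.

$$\sigma_x = \pm 1: \ \psi = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ \pm 1 \end{pmatrix}; \qquad \sigma_y = \pm 1: \ \psi = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ \pm i \end{pmatrix}$$

Thus, the eigenfunctions of all three noncommutative operators are different.

THE QUASI-CLASSICAL APPROXIMATION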

Application of quantum mechanics to specific problems frequently
encounters formidable mathematical difficulties. In such cases it
may prove necessary to make use of various approximate methods.
Such methods should not be seen simply as a forced substitute for
exact solutions. On the contrary, an analysis of the approximation
makes it possible to gain a deeper insight into the main aspects of the
problem and distinguish between the main and the secondary. When
the approximation is found to be lacking we usually find that the
initial simplifying assumptions were not valid.

In this section we shall examine an approximate method of quantum
mechanics, which is applicable when the problem in hand has
a close classical correspondence.

The Quasi-Classical Approximation. It was shown in Section 23
that the limiting transition from quantum to classical mechanics is
achieved by the substitution

$$\psi = e^{iS/h} \qquad (31.1)$$

where $S$ is the action of a particle. In order to perform the limiting
transition we must formally assume $h = 0$. Suppose, now, that the
action quantum is small, but finite, in comparison with the
characteristic quantities having the dimensions of action in the problem in hand.
Then $S$ cannot be considered strictly equal to the classical action and
should be represented as an expansion:

$$S = S_0 + hS_1 + h^2 S_2 + \ldots \qquad (31.2)$$

Here, $S_0$ is the classical action, while all the other terms are quantum
corrections. In specific problems we usually have to deal, in addition
to $S_0$, with only the first term of the expansion, $S_1$, because if other
terms are of the same order as the first and second terms, then the
approximation itself is not justified.

Substituting the expansion (31.2) into Eq. (31.1), and then into the
Schrödinger one-dimensional equation, we find $S_0$, $S_1$, $S_2$, ... for
motion with one degree of freedom. In the case when the variables
separate in the Schrödinger equation for a system with several
degrees of freedom, as, for example, in a central field, substitutions
similar to (31.1) and (31.2) can be performed for each degree of
freedom separately.

After cancelling out $\psi = e^{iS/h}$ we obtain from the Schrödinger
equation the exact equation for $S$, as yet without approximations:

$$(S')^2 + \frac{h}{i}\, S'' = 2m(E - U) = p^2 \qquad (31.3)$$

We substitute the series (31.2) into this equation and collect the
terms multiplying the same powers of $h$ to get

$$(S_0')^2 - p^2 + h\left(2S_0' S_1' + \frac{1}{i}\, S_0''\right) + h^2\left(2S_0' S_2' + (S_1')^2 + \frac{1}{i}\, S_1''\right) + \ldots = 0 \qquad (31.4)$$

Equating the factors multiplying successive powers of $h$ to zero,
we find the equations for $S_0$, $S_1$, $S_2$, ..., in which the classical
momentum $p = [2m(E - U)]^{1/2}$ is taken as the known quantity.

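Note that in each new approximation there appears a corresponding
new function, so that in principle they may all be defined
successively:

$$S_0' = \pm p \tag{31.5}$$

$$S_1' = \frac{i}{2}\,\frac{S_0''}{S_0'} = \frac{i}{2}(\ln p)' = i\left(\ln\sqrt{p}\right)' \tag{31.6}$$

$$S_2' = \pm\frac{1}{2p}\left\{\left[\left(\ln\sqrt{p}\right)'\right]^2 - \left(\ln\sqrt{p}\right)''\right\} \tag{31.7}$$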

It can be seen from the latter equation that the approximation
becomes invalid when the classical momentum vanishes, that is, at
the turning points⁸ where $E = U(x)$. There the third term of the
expansion becomes infinite. Furthermore, all the expansion terms
become large when the derivatives $p', p'', \dots$, expressed in terms of
the potential energy derivatives $U', U'', \dots$, are large. In other
words, for the approximation to be valid, the force $F = -U'$ and its
coordinate derivatives should not be large. Expansion (31.2) can be
used for a sufficiently smooth potential energy curve. This assumption
is usually satisfied in real problems. But almost always one encounters
turning points, in whose neighbourhood additional investigation
is required.

We first suppose that a solution is found for such $x$'s that are
sufficiently far away from the turning points. Then, retaining the
first and second terms in (31.2), we obtain

$$\psi = \exp\left(\pm\frac{i}{h}\int p\,dx - \ln\sqrt{p}\right) = \frac{1}{\sqrt{p}}\exp\left(\pm\frac{i}{h}\int p\,dx\right) \tag{31.8}$$

8 At these points a particle moving according to classical laws would have
to alter the direction of motion, or turn.

This solution coincides, up to the factor $p^{-1/2}$, with the
conventional representation of the wave function in terms of the action:
$\psi = e^{iS/h}$, $S = \int p\,dx$. The two signs in the formula correspond
to waves travelling in both directions. The approximation (31.8)
is termed quasi-classical.

The Solution in the Classically Inaccessible Domain. Strictly
speaking, however, the term can be applied only to those spatial
domains where the momentum $p$ is a real quantity. In those regions
where $U > E$ the momentum becomes imaginary, and all that remains
is a purely superficial similarity between quantum and classical
formulas. In the problem on a potential well of finite depth
(Sec. 28) it was shown that there is a finite probability of a particle
occurring in a classically inaccessible domain, where $U > E$. For
the case of a rectangular well, the solution is achieved with the
help of boundary conditions, making it possible to match the wave
function for domains where $E > U$ and $U > E$. We shall now consider
a typical problem in which it is necessary to match the wave
functions of a quasi-classical approximation between the domains
where $p^2 > 0$ and where $p^2 < 0$.

Figure 35 presents a potential energy curve for which the wave function
in the classically inaccessible domain is of special interest. The total
energy of the particle is less than the maximum value of the potential
energy. The wave function of a particle located to the left of the
hump cannot decrease to zero along the finite distance from point $x_1$
to point $x_2$. Consequently, neither is the wave function zero to the
right of point $x_2$, where the total energy is again greater than the
potential energy. Here, the wave function ceases to fall off at all, so
that the motion is on the whole infinite. This means that a potential
barrier of finite width cannot retain a particle in a well infinitely.

Let us show how to calculate the transmission coefficient of a
potential barrier, that is, the probability of a particle passing
through it.


The Wave Function in the Neighbourhood of the Turning Points.
In matching the wave function in the regions where $p^2 > 0$ and
where $p^2 < 0$ we encounter the following difficulty: the solution (31.8)
is not valid in the neighbourhood of the two turning points ($x_1$
and $x_2$). It is valid only at a sufficiently great distance from these
points, on either side of both. We can, however, assume that up to
some points, distant from $x_1$ and $x_2$, the potential energy curve is
approximated by its tangents through the turning points though,
of course, the domains of such linear dependence of $U$ on $x$ are still
very small in comparison with the whole domain below the barrier.
We write the stated expansion of the potential energy as follows:

$$U(x) = U(x - x_1 + x_1) \approx U(x_1) + \left(\frac{dU}{dx}\right)_{x_1}(x - x_1) = U(x_1) - F_1(x - x_1) \tag{31.9}$$

where $F_1$ is the force acting on the particle at point $x_1$, and $U(x_1) = E$.

We substitute the expansion (31.9) into the Schrödinger equation
to get

$$-\frac{h^2}{2m}\frac{d^2\psi}{dx^2} = (E - U)\psi = (x - x_1)F_1\psi \tag{31.10}$$

At a sufficient distance from the turning point we can make use
of the quasi-classical approximation (31.8) and represent the
solution in the form

$$\psi^{\pm} = \frac{1}{[2mF_1(x - x_1)]^{1/4}}\exp\left(\pm\frac{i}{h}\int_{x_1}^{x}[2mF_1(x - x_1)]^{1/2}\,dx\right) \tag{31.11}$$

Although the integral involves the lower limit $x_1$, close to $x = x_1$
the solution (31.11) is not valid. At the turning point $x = x_1$ in
Figure 35, the derivative $(dU/dx)_1$ is positive. Here, the force of
attraction $F_1 = -(dU/dx)_1 < 0$, and therefore to the left of point $x_1$
the radicand is positive. To the right of the turning point the radicand
is negative, so that the root is imaginary. Here the solution has the form

$$\psi = \frac{1}{[2m|F_1|(x - x_1)]^{1/4}}\exp\left(\mp\frac{2}{3h}(2m|F_1|)^{1/2}(x - x_1)^{3/2}\right) \tag{31.12}$$



Both solutions (31.11) and (31.12) are valid only where the
quantities in the exponents are sufficiently large in magnitude in
comparison with unity. But from (31.11) and (31.12) it is not apparent
how the solution passes through the intermediate domain, that is,
the form a solution far to the left of $x_1$ (say, $\psi^{+}$) assumes when $x$
varies to values far to the right of $x_1$.

In order to link together two asymptotic solutions we have to
know the exact solution for the intermediate domain and extend it
to domains lying far away from the turning point. If on the left the
exact solution takes, say, the form $\psi^{+}$, on the right we obtain a linear
combination of $\psi^{+}$ and $\psi^{-}$. But these are asymptotic forms of one
and the same solution, and they should consequently be used to
match the wave functions on both sides of the turning point.

Let us now show how the exact solution is developed.
Equation (31.10) always allows for an exact solution, since it involves
the independent variable linearly. Before we proceed to find the
solution, let us make the following substitution:

$$\xi = \left(\frac{2mF_1}{h^2}\right)^{1/3}(x - x_1) \tag{31.13}$$

so as to get rid of extra letters cluttering up the equation. It can
then be rewritten as follows:

$$\frac{d^2\psi}{d\xi^2} + \xi\psi = 0 \tag{31.14}$$

The solution is conveniently sought in the form

$$\psi = \int e^{i\xi q} f(q)\,dq \tag{31.15}$$

where the integration limits are not stated for the time being. We
substitute (31.15) into (31.14), performing the following operations:

$$\frac{d^2\psi}{d\xi^2} = -\int e^{i\xi q} q^2 f(q)\,dq$$

$$\xi\psi = \int \xi e^{i\xi q} f(q)\,dq = -i\int\left(\frac{d}{dq}e^{i\xi q}\right)f(q)\,dq = -ie^{i\xi q}f(q)\,\Big| + i\int e^{i\xi q}\frac{df}{dq}\,dq$$

We select the integration limits such that the integrated
expression vanishes. Then substitution into the differential equation
(31.14) yields

$$\int e^{i\xi q}\left(i\frac{df}{dq} - q^2 f\right)dq = 0 \tag{31.16}$$

Since this equation must hold at all values of $\xi$, the function $f$
satisfies the equation

$$i\frac{df}{dq} - q^2 f = 0 \tag{31.17}$$

so that

$$f = \exp\left(-i\int q^2\,dq\right) = e^{-iq^3/3} \tag{31.18}$$

whence

$$\psi = \int e^{i(q\xi - q^3/3)}\,dq \tag{31.19}$$



This solution holds for any $\xi$'s, positive as well as negative. By
carrying out certain computations based on the theory of functions
of a complex variable, we can show that for large absolute values
of $|\xi|$ the obtained solution passes into a solution of the form (31.11)
and (31.12), between which there is the following correspondence
(solution to the left of $x_1$ on the left-hand side, solution to the
right of $x_1$ on the right-hand side):

$$\frac{1}{\xi^{1/4}}\sin\left(\frac{2}{3}\xi^{3/2} + \frac{\pi}{4}\right) \longleftrightarrow \frac{1}{2|\xi|^{1/4}}\,e^{-\frac{2}{3}|\xi|^{3/2}} \tag{31.20a}$$

$$\frac{1}{\xi^{1/4}}\cos\left(\frac{2}{3}\xi^{3/2} + \frac{\pi}{4}\right) \longleftrightarrow \frac{1}{|\xi|^{1/4}}\,e^{+\frac{2}{3}|\xi|^{3/2}} \tag{31.20b}$$

Substituting $x$ for $\xi$, we see that the solutions on the left are
formed by linear combinations of $\psi^{\pm}$, expressed according to
(31.11). In turn, the exponents in (31.11) at even greater distances
from the turning points should be replaced by the integrals
$\int p\,dx$ and $\int |p|\,dx$.
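
The correspondence (31.20a) can be checked numerically without any complex-variable theory. The sketch below is an illustration only: it assumes the equation in the form $\psi'' + \xi\psi = 0$, starts from the decaying asymptotic form deep on the right of the turning point ($\xi < 0$), integrates through $\xi = 0$ with a hand-written fourth-order Runge-Kutta step, and compares the result with the oscillating form on the left ($\xi > 0$). The function names, starting point and step size are arbitrary choices made here.

```python
import math

def decaying_side(xi):
    """Right-hand form in (31.20a), valid for xi < 0 far from the turning point."""
    u = -xi
    return 0.5 * u ** -0.25 * math.exp(-(2.0 / 3.0) * u ** 1.5)

def oscillating_side(xi):
    """Left-hand form in (31.20a), valid for xi > 0 far from the turning point."""
    return xi ** -0.25 * math.sin((2.0 / 3.0) * xi ** 1.5 + math.pi / 4.0)

def integrate_through_turning_point(xi_start=-8.0, xi_end=10.0, step=1e-3):
    """Integrate psi'' + xi*psi = 0 from xi_start to xi_end with RK4.

    The initial value and slope are taken from the decaying asymptotic form,
    so the numerical solution is the one that dies out below the barrier.
    """
    u = -xi_start
    psi = decaying_side(xi_start)
    dpsi = psi * (0.25 / u + math.sqrt(u))  # d(psi)/d(xi) of the asymptotic form
    xi = xi_start
    n_steps = int(round((xi_end - xi_start) / step))
    for _ in range(n_steps):
        half = xi + 0.5 * step
        k1p, k1d = dpsi, -xi * psi
        k2p, k2d = dpsi + 0.5 * step * k1d, -half * (psi + 0.5 * step * k1p)
        k3p, k3d = dpsi + 0.5 * step * k2d, -half * (psi + 0.5 * step * k2p)
        k4p, k4d = dpsi + step * k3d, -(xi + step) * (psi + step * k3p)
        psi += step * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        dpsi += step * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
        xi += step
    return xi, psi

if __name__ == "__main__":
    for end in (9.5, 10.0):
        xi, psi = integrate_through_turning_point(xi_end=end)
        print(xi, psi, oscillating_side(xi))
```

Since both asymptotic forms are only leading terms, agreement is expected to within a per cent or so of the amplitude $\xi^{-1/4}$, not to machine precision.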

Penetration of the Potential Barrier. In Exercise 2, Section 28,
we obtained the exact formula for the probability of a particle
tunneling through a rectangular potential barrier. We shall now find
the formula for the probability of tunneling through a potential
barrier of arbitrary configuration in the quasi-classical approximation.

The possibility of passing potential barriers is a general property
of motion in quantum mechanics, associated with the fact that the
wave function does not vanish in the classically inaccessible region,
where $E < U$. In the most general case, to determine the probability
of passing below the barrier, it is necessary to solve the Schrödinger
equation for the given problem. Accordingly, a general exact
probability formula cannot be developed. In the quasi-classical
approximation, however, a general formula can be obtained. We shall
now proceed with its deduction.

Since all the exponents in this approximation are large in
comparison with unity, we may conclude that the probability of
penetrating the barrier must be exponentially small, otherwise the very
approximation is inapplicable. But if the probability of penetrating the
barrier is very small, the probability of reflection from it is close
to unity. Hence, to the left of the barrier, from where the particles
approach it, the wave function can, to a high degree of accuracy,
be replaced by a standing wave of the form (28.5):

$$p^{1/2}\psi = \exp\left(-\frac{i}{h}\int_{x_1}^{x}p\,dx - \frac{i\pi}{4}\right) + \exp\left(\frac{i}{h}\int_{x_1}^{x}p\,dx + \frac{i\pi}{4}\right) = 2\sin\left(\frac{1}{h}\int_{x}^{x_1}p\,dx + \frac{\pi}{4}\right)$$

We introduce $i\pi/4$ so as to facilitate the transition through the left
turning point $x_1$. Making use of the correspondence expressed by
(31.20a), we see that the wave function below the barrier has the
required form of a damped exponential:

$$|p|^{1/2}\psi = \exp\left(-\frac{1}{h}\int_{x_1}^{x}|p|\,dx\right)$$

To pass through the second turning point $x_2$, we represent the
function below the barrier as follows:

$$|p|^{1/2}\psi = \exp\left(-\frac{1}{h}\int_{x_1}^{x_2}|p|\,dx\right)\times\exp\left(\frac{1}{h}\int_{x}^{x_2}|p|\,dx\right) \tag{31.21}$$

Here, the first factor is a constant quantity, while the second
increases exponentially into the barrier. A solution of this form is
matched with a wave travelling from the barrier to the right and
having a coefficient of unit modulus.

It follows from all that has been said that the amplitude
of a wave receding beyond the barrier decreases in the ratio
$\exp\left[-(1/h)\int_{x_1}^{x_2}|p|\,dx\right]$, while the probability of
penetrating the barrier is equal to the square of the decrease in
amplitude:

$$D = \exp\left(-\frac{2}{h}\int_{x_1}^{x_2}|p|\,dx\right) \tag{31.22}$$

This formula⁹ can be used only as long as the exponent in it is
large in comparison with unity. Otherwise, to develop the
corresponding relationship it is necessary to use exact wave functions.
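
As an illustration of how (31.22) is used in practice, the following sketch evaluates the barrier factor by simple midpoint quadrature. It is a minimal example under stated assumptions: the function name `barrier_factor`, the dimensionless units ($m = h = 1$) and the two test barriers are choices made here for illustration, and the turning points are supplied by the caller instead of being found automatically.

```python
import math

def barrier_factor(U, E, m, x1, x2, hbar, n=20000):
    """Barrier factor D of (31.22), computed with the midpoint rule.

    U    : callable, potential energy U(x)
    E    : total energy, with E < U(x) between the turning points
    m    : particle mass
    x1   : left turning point
    x2   : right turning point
    hbar : action quantum (the h of the text)
    """
    dx = (x2 - x1) / n
    integral = 0.0
    for k in range(n):
        x = x1 + (k + 0.5) * dx
        gap = U(x) - E
        if gap > 0.0:  # |p| = sqrt(2m(U - E)) in the inaccessible region
            integral += math.sqrt(2.0 * m * gap) * dx
    return math.exp(-2.0 * integral / hbar)

if __name__ == "__main__":
    # Rectangular barrier: height 10, width 2, particle energy 4 (m = hbar = 1).
    print(barrier_factor(lambda x: 10.0, 4.0, 1.0, 0.0, 2.0, 1.0))
    # Parabolic barrier U = U0 (1 - x^2/a^2) with U0 = 10, a = 2, energy 4.
    b = 2.0 * math.sqrt(1.0 - 4.0 / 10.0)  # turning points at x = -b and x = +b
    print(barrier_factor(lambda x: 10.0 * (1.0 - x * x / 4.0), 4.0, 1.0, -b, b, 1.0))
```

For the rectangular barrier the integral is elementary, and for the parabolic one it reduces to the area of a half-ellipse, so both cases give closed-form exponents against which the quadrature can be compared. In both examples the exponent is well above unity, as the formula requires.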

9 Rigorous proof that the factor multiplying the exponential is unity is
extremely involved. We have limited ourselves to a simplified derivation of
(31.22).

The existence of tunneling shows that the concept of path is
sometimes totally inapplicable to quantum motion. A path continued
below the barrier would lead to imaginary values of the velocity.

The uncertainty relations (22.4a) indicate only the lower limit 
of possible inaccuracies in stating the coordinates and momenta. 
When a particle is located below the barrier, the inaccuracies 
increase greatly. Wherever it is located below the barrier, its velocity 
is an imaginary quantity, that is, it is completely indeterminate. 

The same can be stated differently. In a change in the energy of
a particle located in the domain below the barrier, the inaccuracy
in the energy value is so great that we can no longer assert that the
particle possesses energy there.

Alpha-Decay. Penetration of a potential barrier enables us to
explain one of the most important facts of nuclear physics,
alpha-decay. The nuclear masses of heavy elements with atomic numbers
greater than that of lead satisfy an inequality of the form (14.9):

$$m(A, Z) > m(A - 4, Z - 2) + m(4, 2)$$

Here $A$ is the atomic weight and $Z$ is the atomic number. Thus,
$m(4, 2)$ is the mass of a helium nucleus with atomic weight 4 and
atomic number 2. Such a nucleus emitted in alpha-decay is called
an alpha-particle.

All that can be seen from the inequality is that the spontaneous
decay of a nucleus of mass $m(A, Z)$ is possible, though no indication
is obtained about the time law of disintegration. The nuclei of
certain elements have mean decay times of $10^{10}$ years while others
have decay times of about $10^{-6}$ s, which is a difference of 23 orders
of magnitude. It will be noted that the energy of the alpha-particles
emitted differs here by a factor of only two. From experiment it turns
out that the logarithm of the mean decay time of a nucleus is
inversely proportional to the alpha-particle velocity. It is this
logarithmic law that corresponds to the difference of 23 orders of
magnitude. It is accounted for by the difference of barrier factors
which depend exponentially upon the energy.

The problem is to develop a suitable, and as far as possible simple, 
model of a nucleus to which the observed law of decay, based on 
quantum mechanical laws, would correspond. Such a model should 
not, furthermore, contradict other facts regarding the nucleus. 

Although at present there is no quantitative theory of nuclear 
interactions, there exists a quite satisfactory, sufficiently universal 
model of a nucleus, which makes it possible to describe and reveal 
the interconnections of all observed phenomena in nuclear physics, 
up to particle energies of hundreds of MeV. 

In the case of alpha-decay, the most general aspects of this model
are sufficient. At large distances from the nucleus, exceeding
$10^{-12}$ cm, no specific nuclear interactions manifest themselves. At
such distances an alpha-particle is subject only to the action of the
electrostatic Coulomb force, to which corresponds the potential energy

$$U = \frac{2(Z - 2)e^2}{r} \tag{31.23}$$

At small distances, attractive forces must act, of course, since
otherwise the nucleus $(A, Z)$ could not exist at all. We do not know
the force law, that is, the shape of the potential-energy curve close
to the nucleus, but we can assert that it should correspond to a very
steep dependence of the forces on distance, as in all nuclear
interactions. It was pointed out in Section 28 that simplified models
known as potential wells are quite sufficient for the description of
such potential curves.

Figure 36 presents such an assumed potential well in which an 
alpha-particle is located in the nucleus. Farther from the nucleus 
the curve corresponding to the well transforms in some way into 
a Coulomb repulsion curve; the finer details of the transformation
law, as will be apparent from subsequent computations, are
immaterial. The energy level $E$ is plotted above zero, since otherwise the
alpha-particle would simply be incapable of flying out of the well. 
The simple curve in Figure 36 is quite adequate to explain the law 
of alpha-decay. 

To determine the probability of alpha-decay it is sufficient to
calculate the barrier factor $D$ from (31.22). Since nuclear forces are
short-range, the transition region between electrostatic and nuclear
forces is small, and the Coulomb law can be considered to hold up
to the point $r = r_1$, that is, up to the vertical boundary of the well.
Point $r_1$ is the effective radius of the nucleus, determined from
alpha-decay. (Other nuclear data lead to somewhat different values of
its radius, but the differences have been found to be fully within
acceptable limits.)

Thus, we compute the barrier factor from the formula

$$D = \exp\left\{-\frac{2}{h}\int_{r_1}^{r_2}\left[2m\left(\frac{2(Z - 2)e^2}{r} - E\right)\right]^{1/2}dr\right\} \equiv e^{-P} \tag{31.24}$$

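The integral in the exponent can easily be calculated by the
substitution

$$\frac{rE}{2(Z - 2)e^2} = \cos^2\varphi$$

Then after elementary treatment it is reduced to

$$P = \frac{2}{h}(2m)^{1/2}\int_{r_1}^{r_2}\left(\frac{2(Z - 2)e^2}{r} - E\right)^{1/2}dr \tag{31.25}$$

$$P = \frac{2}{h}(2m)^{1/2}\,\frac{2(Z - 2)e^2}{E^{1/2}}\left[\arccos\left(\frac{Er_1}{2(Z - 2)e^2}\right)^{1/2} - \left(\frac{Er_1}{2(Z - 2)e^2}\right)^{1/2}\left(1 - \frac{Er_1}{2(Z - 2)e^2}\right)^{1/2}\right] \tag{31.26}$$
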


The quantity $Er_1[2(Z - 2)e^2]^{-1}$ is the ratio of the energy of the
alpha-particle to the effective barrier height at point $r_1$, taken
according to Eq. (31.23). Let us evaluate this ratio. For heavy
nuclei, $2(Z - 2) \approx 180$, $r_1 \approx 9 \times 10^{-13}$ cm, and we take $E$ equal
to 6 MeV ($10^{-5}$ erg), $e^2 \approx 23 \times 10^{-20}$ esu$^2$. From this

$$\frac{Er_1}{2(Z - 2)e^2} \approx \frac{1}{5}$$

This quantity can be considered small. Then in the right-hand
side of (31.26) we obtain, approximately,

$$P \approx \frac{2}{h}(2m)^{1/2}\,\frac{2(Z - 2)e^2}{E^{1/2}}\left[\frac{\pi}{2} - 2\left(\frac{Er_1}{2(Z - 2)e^2}\right)^{1/2}\right] = \frac{2\pi e^2}{hv}\,2(Z - 2) - \frac{8}{h}\left[mr_1e^2(Z - 2)\right]^{1/2} \tag{31.27}$$

where $v = (2E/m)^{1/2}$ is the velocity of the alpha-particle. The
validity of this expansion is easily checked by direct substitution.
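
The "direct substitution" can also be done numerically. The sketch below is illustrative only: it writes $y$ for the ratio $Er_1/[2(Z - 2)e^2]$ (a shorthand introduced here) and compares the square bracket of the closed form (31.26), its small-$y$ limit $\pi/2 - 2y^{1/2}$ used in (31.27), and a direct quadrature of the integral in (31.25) rescaled to the variable $s = rE/[2(Z - 2)e^2]$.

```python
import math

def bracket_exact(y):
    """Square bracket of (31.26); y = E*r1 / (2(Z-2)e^2), with 0 < y < 1."""
    return math.acos(math.sqrt(y)) - math.sqrt(y * (1.0 - y))

def bracket_small_y(y):
    """Leading terms for small y, as used in (31.27)."""
    return math.pi / 2.0 - 2.0 * math.sqrt(y)

def bracket_by_quadrature(y, n=20000):
    """The same bracket from the integral in (31.25).

    With s = r*E / (2(Z-2)e^2) the integral becomes int_y^1 sqrt(1/s - 1) ds,
    evaluated here by the midpoint rule.
    """
    ds = (1.0 - y) / n
    total = 0.0
    for k in range(n):
        s = y + (k + 0.5) * ds
        total += math.sqrt(1.0 / s - 1.0)
    return total * ds

if __name__ == "__main__":
    y = 0.2  # the ratio of about 1/5 estimated in the text
    print(bracket_exact(y), bracket_by_quadrature(y), bracket_small_y(y))
```

For $y = 1/5$ the approximate bracket differs from the exact one by a few per cent, which is immaterial next to the uncertainty of the pre-exponential factor.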

Knowing the barrier factor, it is simple to find the time law of
alpha-decay. For that we make use of Eq. (23.18). The surface
integral in the right-hand side of this equation should be referred
to an infinitely remote surface, insofar as outside the nucleus all
alpha-particles are receding from it. The total flux across such a
surface yields the probability of decay in unit time. The space integral
in the left-hand side of (23.18) need be extended only over the volume
of the nucleus, since the wave function of an alpha-particle falls off
exponentially below the barrier. According to the principal result
of Section 23, the probability of finding an alpha-particle in the
nucleus is

$$N = \int |\psi|^2\,dV$$


We can assume that at some initial time $N$ was equal to unity.
Then the law according to which it decreases with time can be
determined from Eq. (23.18). The amplitude of the wave function
decreases by the factor $D^{1/2}$ in penetrating the barrier. If the
amplitude is assumed to be unity in the nucleus, as we did in calculating
the barrier factor, then the wave function at infinity should be

$$\psi = \frac{BD^{1/2}}{(4\pi)^{1/2}}\,\frac{e^{ipr/h}}{r} \tag{31.28}$$

We have made use of the fact that, to pass from one-dimensional
to three-dimensional motion, the wave function must be divided
by $r$ (see Sec. 29). The factor $(4\pi)^{-1/2}$ appears in normalizing the
wave function to unity over the solid angle; the coefficient $B$ is
associated with normalization to $N$ over the volume of the nucleus.

Substituting the expression (31.28) into the right-hand side of
(23.18), which expresses the law of conservation of the number of
particles in nuclear decay, and replacing the normalization
coefficient $B$ by the probability of finding the alpha-particle in the
nucleus, we can obtain the following law of alpha-decay:

$$\frac{dN}{dt} = -\frac{\Gamma}{h}N, \qquad \Gamma = \frac{4E_1h}{mr_1v}\,\frac{1}{[2(Z - 2)e^2/(Er_1) - 1]^{1/2}}\,e^{-P} \tag{31.29}$$



where $E_1$ is the distance from the energy level $E$ to the “bottom”
of the potential well in Figure 36; $E_1$ is evaluated extremely roughly.
But the main significance of Eq. (31.29) lies not in the
pre-exponential factor, but in the exponent of the exponential, which gives
the fundamental dependence of the decay probability upon the energy
of the alpha-particle. Taking this into account, and bearing in mind
the arbitrariness in the definition of $E_1$, we did not reproduce the
detailed development of the coefficient multiplying the barrier
factor in (31.29). This coefficient is entirely different if it is assumed
that an alpha-particle does not exist in the nucleus as an entity and
forms only at the moment of emission, which is probably closer to
the truth. But with this assumption, too, the barrier factor remains
the same.

Integrating (31.29), we obtain an exponential law for decreasing
decay activity:

$$N = e^{-\Gamma t/h} \tag{31.30}$$

Here, we put $N(0) = 1$. The quantity $\Gamma$ has the dimensions of energy
for the sake of convenience in comparing it with other quantities
having such dimensions. Every nucleus has the same decay probability
per unit time regardless of how long it has existed without
disintegrating. This probability is $\Gamma/h$, and it is independent of time.

Equations (31.27) and (31.29) confirm the law of inverse
proportionality between the logarithm of the probability of alpha-decay
and the experimentally determined velocity $v$ of an emitted
α-particle. A simple computation shows that, for a twofold change in the
energy of an α-particle, from 4 to 8 MeV, the disintegration time
changes by 22 orders of magnitude. The laws of classical mechanics
are quite incapable of explaining such a strong energy dependence
of the decay time. But the cause of this dependence lies precisely
in the quasi-classical motion of an α-particle: by classical laws it
cannot leave the nucleus at all, while thanks to the finite value of
the action quantum $h$ there appears a very low probability, which
decreases sharply together with the energy of the alpha-particle.
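
The "simple computation" can be sketched as follows. The example evaluates the exponent $P$ from the closed form (31.26) at 4 and 8 MeV and converts the difference into powers of ten. The values of $2(Z - 2)$, $r_1$ and $e^2$ are the ones quoted in the text; the alpha-particle mass, the numerical value of $h$ and the MeV-to-erg factor are standard values supplied here as assumptions, and the pre-exponential factor of (31.29) is ignored altogether.

```python
import math

H = 1.054e-27        # action quantum h of the text, erg*s (assumed standard value)
M_ALPHA = 6.64e-24   # alpha-particle mass, g (assumed, not quoted in the text)
MEV = 1.602e-6       # erg per MeV (assumed standard value)
E2 = 23e-20          # e^2, esu^2, as quoted in the text
CHARGE = 180.0       # 2(Z - 2) for a heavy nucleus, as quoted in the text
R1 = 9e-13           # effective nuclear radius r1, cm, as quoted in the text

def gamow_exponent(energy_mev):
    """Exponent P of (31.24), evaluated with the closed form (31.26)."""
    energy = energy_mev * MEV
    b = CHARGE * E2                       # 2(Z - 2) e^2
    y = energy * R1 / b                   # ratio of energy to barrier height at r1
    bracket = math.acos(math.sqrt(y)) - math.sqrt(y * (1.0 - y))
    return (2.0 / H) * math.sqrt(2.0 * M_ALPHA) * b / math.sqrt(energy) * bracket

def orders_of_magnitude(e_low_mev, e_high_mev):
    """log10 of D(e_high) / D(e_low), where D = exp(-P)."""
    return (gamow_exponent(e_low_mev) - gamow_exponent(e_high_mev)) / math.log(10.0)

if __name__ == "__main__":
    print(gamow_exponent(4.0), gamow_exponent(8.0))
    print(orders_of_magnitude(4.0, 8.0))
```

With these inputs the barrier factor changes by roughly twenty-odd powers of ten between 4 and 8 MeV, in line with the figure quoted above; the exact number depends on the assumed radius and charge.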

The Width of a Level. Using the expression (31.30), let us write
in explicit form the time dependence of the wave function of a
nucleus that has not emitted an alpha-particle. This dependence has
the form

$$\psi \propto e^{-\Gamma t/(2h)}\,e^{-iEt/h} \tag{31.31}$$

The first factor accounts for the exponential fall-off of amplitude
according to the $e^{-\Gamma t/(2h)}$ law (since the probability, or the square of
the amplitude, diminishes according to $e^{-\Gamma t/h}$); the second factor is the
usual wave-function time factor. Expression (31.31) is very similar
to the well-known formula for damped oscillations, with the
difference that in the given case it is the probability amplitude of the
initial (not yet decayed) state of the nucleus that is damped.

For this state there exists a finite probability flux for the emission
of a particle from the nucleus, which is proportional to the
probability of an alpha-particle being in the nucleus. It is this that leads
to the exponential fall-off of the probability of the state prior to
decay.

All nuclei before decay are described by exactly the same wave
function (31.31), if at the instant $t = 0$ they were in the initial
state. Therefore, they all have a perfectly identical probability of
decaying in unit time, and it is impossible to predict which one of
them will decay earlier and which later. In exactly the same way,
in the diffraction experiment it is impossible to say which part of the
photographic plate will be hit by a given electron. The decay law is
purely statistical, in the same way as the law for diffraction patterns.

Alpha-decay cannot be treated as an end result of some temporal 
process inside a nucleus: a nucleus is always, to exactly the same 
extent, ready for a decay process. This is indicated by the fixed form 
of the decay law. 

For this reason time can, in principle, be equally measured by 
periodic or radioactive processes. Actually, the law of both processes 
is exponential: in one case it is with a real exponent, in the other, 
with an imaginary one. 

The wave function (31.31) can be ascribed to the complex energy
eigenvalue $E_c = E - i\Gamma/2$. This eigenvalue does not contradict
the Hermiticity of the Hamiltonian. We noted in Section 26 that any
differential operator is defined only after the boundary conditions
imposed on the wave function are stated. Up till now we chose these
conditions in real form: for example, for the case of finite motion the
wave function was supposed to be zero at infinity. For the case of
alpha-decay, we have a different boundary condition: at infinity
the wave function is of the complex form (31.28) corresponding to
a diverging spherical wave. The eigenvalue is complex because the
wave function is complex.

Suppose now that the wave function (31.31) is expanded in a
Fourier integral over real-energy functions whose time dependence
is determined by the exponentials $e^{-iE't/h}$. What is the order of
magnitude of the energy interval in which the amplitudes of the
real-energy functions are other than zero?

We write the Fourier-integral expansion

$$e^{-iE_ct/h} = e^{-\Gamma t/(2h) - iEt/h} = \int_{-\infty}^{\infty} a(E')\,e^{-iE't/h}\,dE' \tag{31.32}$$

Then the amplitude $a(E')$ is

$$a(E') = \frac{1}{2\pi h}\int_0^{\infty} dt\,e^{-\Gamma t/(2h) - i(E - E')t/h} = \frac{1}{2\pi[\Gamma/2 + i(E - E')]} \tag{31.33}$$

and the square of the amplitude is

$$|a(E')|^2 = \frac{1}{4\pi^2[(E - E')^2 + \Gamma^2/4]} \tag{31.34}$$

We see from this that $|a(E')|^2$ decreases by half when $E'$ is at
a distance $\pm\Gamma/2$ from $E$, so that the whole “half-width” of the energy
interval is equal to $\Gamma$ (Figure 37).

It can be seen from definition (31.32) that the dimensions of the
amplitude $a(E')$ are (energy)$^{-1}$. Therefore, if we integrate the square
of the amplitude over energy we obtain a quantity whose dimensions
are also (energy)$^{-1}$. But this is the total area of the curve (31.34)
characterizing the energy interval in which the amplitudes of the
expansion are other than zero. Denoting this interval $\Delta E$, we obtain,
by definition

$$\frac{1}{\Delta E} \equiv \int_{-\infty}^{\infty}|a(E')|^2\,dE' = \frac{1}{4\pi^2}\int_{-\infty}^{\infty}\frac{dE'}{(E - E')^2 + \Gamma^2/4} = \frac{1}{2\pi\Gamma} \tag{31.35}$$

On the other hand, the mean decay time can be defined as

$$\Delta t = \frac{\Gamma}{h}\int_0^{\infty} e^{-\Gamma t/h}\,t\,dt = \frac{h}{\Gamma} \tag{31.36}$$

Comparing (31.35) and (31.36) yields

$$\Delta E\,\Delta t = 2\pi h \tag{31.37}$$

The obtained relation is similar in form to the uncertainty
relation (22.4a). Note that the factor $2\pi$ in the right-hand side of (31.37)
is linked with the definition of $\Delta E$ through Eq. (31.35). If $\Delta E$ were
determined as in (19.66), according to the curve $|a(E')|$, in the
right-hand side of (31.37) we would have unity.
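
The chain (31.34) to (31.37) is easy to verify numerically. The sketch below is an illustration under stated assumptions: it works in units with $h = 1$, cuts the infinite energy range off at a large multiple of $\Gamma$, and uses plain midpoint sums; the function names and the cut-off parameters are choices made here.

```python
import math

def amplitude_squared(e_prime, e0, gamma):
    """|a(E')|^2 from (31.34)."""
    return 1.0 / (4.0 * math.pi ** 2 * ((e0 - e_prime) ** 2 + gamma ** 2 / 4.0))

def energy_interval(e0, gamma, span=2000.0, n=400000):
    """Delta E of (31.35): reciprocal of the area under |a(E')|^2.

    The infinite range is replaced by e0 +/- span*gamma (midpoint rule),
    which leaves a small truncation error from the slowly decaying tails.
    """
    low = e0 - span * gamma
    de = 2.0 * span * gamma / n
    area = 0.0
    for k in range(n):
        area += amplitude_squared(low + (k + 0.5) * de, e0, gamma)
    return 1.0 / (area * de)

def mean_decay_time(gamma, h=1.0, lifetimes=40.0, n=40000):
    """Delta t of (31.36): mean of t with the weight (gamma/h) exp(-gamma t/h)."""
    rate = gamma / h
    dt = lifetimes / rate / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        total += rate * math.exp(-rate * t) * t * dt
    return total

if __name__ == "__main__":
    gamma = 0.3
    d_e = energy_interval(5.0, gamma)
    d_t = mean_decay_time(gamma)
    print(d_e, d_t, d_e * d_t, 2.0 * math.pi)
```

The product comes out close to $2\pi h$ independently of the value chosen for $\Gamma$, and the half-maximum points of the curve sit at $E \pm \Gamma/2$.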

The relation (31.37) should be formulated as follows: the energy
of a state existing for a limited time $\Delta t$ is determinate to the accuracy
of a quantity of the order $2\pi h/\Delta t$. Only the energy of a state of
infinite duration is defined exactly.

The meaning of the uncertainty relation for position and
momentum is not analogous to the meaning of (31.37). The evaluation
(22.4a) expresses the fact that position and momentum do not exist
in the same state; (31.37) means that if the state of a system is of
finite duration, $\Delta t$, then its energy at every instant belonging to $\Delta t$
is not defined precisely and lies within some interval of values, $\Delta E$,
of the order $2\pi h/\Delta t$.

The quantity $\Gamma \sim \Delta E$ is the width of the energy level of the system.
The concept of level width can be applied to any state of finite
duration, not only to the states of systems capable of alpha-decay. For
example, the energy level of an atom in an excited state has a
definite nonzero width, since an excited atom is capable of
spontaneous emission of a quantum.

Explanation of the Level Width. We shall now show how the level
width of a nucleus capable of alpha-decay can be found by considering
the wave function variation below a potential barrier.

It was shown in Section 28 that infinite motion has a continuous
energy spectrum. The motion of a particle penetrating a potential
barrier is infinite because it is capable of going to infinity. It follows,
strictly speaking, that a nucleus capable of alpha-decay should have
a continuous energy spectrum. Actually, we made use of this in the
expansion (31.32).

Let us now see how the quasi-discrete levels $E_c$ are found. From
(31.32) we see that even for a nucleus with a very short alpha-decay
time ($t \sim 10^{-6}$ s), $\Gamma \sim 10^{-22}$ erg or $0.6 \times 10^{-10}$ eV. How is it
possible to combine a continuous spectrum with such a narrow
energy interval?

The general solution to the wave equation between points $r_1$ and $r_2$
is of the following form:

$$\psi = \frac{C_1}{|p|^{1/2}}\exp\left(-\frac{1}{h}\int_{r_1}^{r}|p|\,dr\right) + \frac{C_2}{|p|^{1/2}}\exp\left(\frac{1}{h}\int_{r_1}^{r}|p|\,dr\right) \tag{31.38}$$

The first term exponentially decreases with $r$, while the second
exponentially increases. It follows that if the barrier were extended
to infinity rightwards, a solution would exist only for $C_2 = 0$. The
ratio $C_2/C_1$, determined from the boundary conditions at $r = r_1$,
is a function of energy. It is the roots of equation $C_2(E) = 0$ that give
the possible energy eigenvalues for finite motion. The energy of a
particle in a well of finite depth is obtained in just this way.

Motion of a particle in a well, which was considered in Sec. 28,
differs from motion of a particle beyond the barrier because the
barrier is of finite width. Therefore, in this case the second solution,
proportional to $C_2$, need not be strictly equal to zero and may be
just small compared with the first solution in some small interval of
values, $\Delta E$, close to a root of equation $C_2(E) = 0$. This region of
values, $\Delta E$, is what corresponds to the assumption that the modulus
of the wave function outside the nucleus is small compared with the
wave function inside the nucleus.

In other words, if the energy of the nucleus is contained in a given
region of values $\Delta E$, we can say that the alpha-particle is in a sense
bound in the nucleus, that is, with overwhelming probability found
within it over a certain finite time interval $\Delta t$.

The higher or wider the potential barrier, the less the barrier
factor $D$ and the less the decay probability $\Gamma/h$ proportional to it.
But then $\Delta E$ is also correspondingly reduced, that is, the
continuous-spectrum state becomes closer to the discrete-spectrum state with an
exact energy value $E$. This is what explains the meaning of the
uncertainty $\Delta E$: it indicates how close the state is to a bound one,
with an infinite lifetime.

We can also say that as the barrier width increases, that is, as the
barrier factor decreases, the imaginary part of the energy, equal to
$-\Gamma/2$, tends to zero, while $E_c$ tends to the bound-state energy.

The uncertainty $\Delta E$ in no way limits the applicability of the
energy conservation law: the total energy of a nucleus and an
alpha-particle is constant. But a state with a strictly defined energy cannot
refer to an alpha-particle in a nucleus, since if its coordinate is
given in an interval close to $r_1$, the energy can no longer have an
exact value. As was pointed out before, this case is quite unlike free
motion, therefore the uncertainty in energy should be calculated
with the help of the decay probability, not simply from the
relation (22.4a). If the energy is stated precisely, it refers to both an
undecayed nucleus and a decayed nucleus. Superimposed, these states
of the nucleus yield a general state with an exact energy value.

Any state capable of spontaneous transition to another state with 
the same energy possesses a certain energy width. A precisely defined 
energy always corresponds to a superposition of states capable of 
transition from one to the other. 

We can divide the total level width into partial widths related to
the probabilities for various transitions. Thus, strongly excited
nuclear states are capable of emitting neutrons of various energies
and of radiating gamma quanta. Each possibility contributes its
exponential in the term characterizing attenuation of the wave
function (31.31). The total attenuation is determined by the product
of such exponentials. It follows that the total level width is equal
to the sum of its widths in relation to all decay possibilities.

The Bohr Quantum Conditions. The quasi-classical approximation
makes it possible to determine the energy levels of a particle
moving in a potential well. To avoid repetition of similar diagrams,
we shall assume that in Figure 35 the bulge of the potential energy
curve faces downwards. Then, between points $x_1$ and $x_2$ the total
energy is greater than the potential energy, so that this is a classically
possible domain of motion. According to the laws of classical
mechanics, the particle performs periodic motion similar to a pendulum
(cf. Fig. 8). The total period of the motion is

$$\tau = 2\int_{x_1}^{x_2}\frac{dx}{v} = (2m)^{1/2}\int_{x_1}^{x_2}\frac{dx}{[E - U(x)]^{1/2}} \tag{31.39}$$

Let us now determine the restrictions imposed by quantum mechanics 
on the possible energy eigenvalues of the particle. Since to 
the right of point $x = x_2$ the potential energy is greater than the 
total energy, the solution of the wave equation must be represented 
in the form of an attenuating exponential: 

$$\psi = \frac{1}{2|p|^{1/2}}\exp\left(-\frac{1}{\hbar}\int_{x_2}^{x}|p|\,dx\right)$$

Using the conditions for matching, (31.20), we should pass from 
this solution to the solution inside the well. We represent it as 

$$\psi = \frac{1}{p^{1/2}}\sin\left(\frac{1}{\hbar}\int_{x}^{x_2}p\,dx+\frac{\pi}{4}\right) \tag{31.40a}$$

since for $x$ close to $x_2$ it transforms into (31.20a). 

To the left of point $x = x_1$ the solution should again have the 
form of an attenuating exponential: 

$$\psi = \frac{C}{2|p|^{1/2}}\exp\left(-\frac{1}{\hbar}\int_{x}^{x_1}|p|\,dx\right)$$

If we apply the conditions for matching to it, the function in the 
region of the well will be expressed as follows: 

$$\psi = \frac{C}{p^{1/2}}\sin\left(\frac{1}{\hbar}\int_{x_1}^{x}p\,dx+\frac{\pi}{4}\right) \tag{31.40b}$$

The functions (31.40a) and (31.40b) must evidently coincide 
for any value of $x$ within the well, since they refer to the same state: 

$$\sin\left(\frac{1}{\hbar}\int_{x}^{x_2}p\,dx+\frac{\pi}{4}\right) = C\sin\left(\frac{1}{\hbar}\int_{x_1}^{x}p\,dx+\frac{\pi}{4}\right) \tag{31.41}$$


For Eq. (31.41) to be valid, one of two conditions must hold: 

(i) $C = -1$, the sum of the phases under the sines is an even number 
multiplied by $\pi$. 

(ii) $C = 1$, the sum of the phases is an odd number multiplied 
by $\pi$. 

Indeed, in case (i), denoting the sine phases $\alpha$ and $2k\pi - \alpha$, we 
obtain 

$$\sin\alpha + \sin(2k\pi-\alpha) = 2\sin k\pi\,\cos(\alpha-k\pi) = 0$$

In case (ii) we have, similarly 

$$\sin\alpha - \sin[(2k+1)\pi-\alpha] = 2\cos\left(k+\tfrac12\right)\pi\;\sin\left[\alpha-\left(k+\tfrac12\right)\pi\right] = 0$$

Combining both cases in one formula, we see that functions (31.40a) 
and (31.40b) coincide if the sum of the phases under the sines is equal 
to an integral multiple of $\pi$ which, evidently, is no less than $\pi$, 
because the sum of the phases is positive: 

$$\frac{1}{\hbar}\int_{x}^{x_2}p\,dx+\frac{\pi}{4}+\frac{1}{\hbar}\int_{x_1}^{x}p\,dx+\frac{\pi}{4} = \pi(n+1)$$

where $n = 0, 1, 2,$ etc. Now, extending the integral over the whole 
period of motion in the classical sense, we arrive at the condition 

$$2\int_{x_1}^{x_2}[2m(E_n-U)]^{1/2}\,dx = 2\int_{x_1}^{x_2}p_n\,dx = 2\pi\left(n+\tfrac12\right)\hbar \tag{31.42}$$

A similar condition was postulated in Bohr’s old theory, but 
there simply $n$ was written in place of $n + 1/2$. Note that in 
contemporary quantum mechanics the terms $\pi/4$ in the cosine phases 
appear only when the potential energy curves have finite slopes at the 
turning points. If that is not the case, further investigation of the 
question of the correct choice of phases is required. In general, the 
quasi-classical approximation is applicable only when the phases 
involved in the wave function are large in comparison with $2\pi$. 
Formula (31.42) is therefore valid only for large $n$. There are, 
however, exceptional cases when it is applicable for all $n$. 
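As a rough numerical illustration of condition (31.42), the sketch below evaluates the integral $2\int p\,dx$ for a harmonic well and searches for the energies at which it equals $2\pi(n + 1/2)\hbar$. The units ($\hbar = m = \omega = 1$), the function names, and the choice of well are assumptions made only for this example; the harmonic oscillator is one of the exceptional cases in which the result holds for every $n$.

```python
import math

# Assumed units for this sketch: hbar = m = omega = 1.
HBAR, MASS, OMEGA = 1.0, 1.0, 1.0


def potential(x):
    """Harmonic well U(x) = m*omega^2*x^2/2, chosen only as a test case."""
    return 0.5 * MASS * OMEGA**2 * x * x


def turning_point(energy, x_max=1.0e3):
    """Positive root of U(x) = E by bisection (U assumed even and growing with |x|)."""
    lo, hi = 0.0, x_max
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if potential(mid) < energy:
            lo = mid
        else:
            hi = mid
    return lo  # slightly inside the well, so E - U stays positive


def action(energy, steps=400):
    """Left-hand side of (31.42): twice the integral of p dx between turning points."""
    a = turning_point(energy)
    d_theta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -0.5 * math.pi + (i + 0.5) * d_theta
        x = a * math.sin(theta)  # substitution x = a*sin(theta) smooths the end points
        p = math.sqrt(2.0 * MASS * (energy - potential(x)))
        total += p * a * math.cos(theta) * d_theta
    return 2.0 * total


def quantized_energy(n, e_lo=1.0e-9, e_hi=1.0e3):
    """Energy at which action(E) = 2*pi*(n + 1/2)*hbar, found by bisection."""
    target = 2.0 * math.pi * (n + 0.5) * HBAR
    for _ in range(100):
        e_mid = 0.5 * (e_lo + e_hi)
        if action(e_mid) < target:
            e_lo = e_mid
        else:
            e_hi = e_mid
    return 0.5 * (e_lo + e_hi)


levels = [quantized_energy(n) for n in range(4)]
```

In these units the entries of `levels` should come out close to $n + 1/2$. Replacing `potential` by another even, monotonically rising well gives the corresponding quasi-classical estimate, which is then trustworthy only for large $n$.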

As distinct from the old quantum theory, quantum mechanics 
obtains the approximate formula (31.42) without any additional, 
extraneous, assumptions, whereas formerly such a formula was 
applied to classical motion. But in classical mechanics it is always 
assumed that mechanical quantities may vary continuously, so that 
the quantum postulate appears sharply to contradict all its 
principles. 

As can be seen from (31.42), the phase of the wave function on the 
segment $(x_1, x_2)$ varies by a number greater than $n\pi$ and smaller than 
$\pi(n + 1)$. It follows that the sine changes its sign $n$ times, that is, 
the function has $n$ zeros, which corresponds to the $n$th energy 
eigenvalue. The eigenvalues $E_n$ increase together with the number $n$, 
that is, as the area under the curve, 

$$\int_{x_1}^{x_2}p\,dx = (2m)^{1/2}\int_{x_1}^{x_2}(E_n-U)^{1/2}\,dx,$$

lying below the line $E_n$ = constant, increases. Thus, the general rule 
is confirmed: the greater the energy eigenvalue the greater the number 
of zeros in the corresponding wave function. 

Let us find the interval $\Delta E$ between two neighbouring energy 
eigenvalues for large $n$. Taking the difference of (31.42) for $n + 1$ 
and $n$, we have 

$$2\int_{x_1}^{x_2}\frac{(2m)^{1/2}}{2[E_n-U(x)]^{1/2}}\,dx\;\Delta E = 2\pi\hbar \tag{31.43}$$

But the integral here is expressed in terms of the classical motion 
period $\tau$ of (31.39) in such a way that 

$$\Delta E = \frac{2\pi\hbar}{\tau} = \hbar\omega \tag{31.44}$$

where $\omega$ is the frequency of classical motion. Thus, in the first 
approximation successive energy levels are equidistant, if we can 
neglect the dependence of the oscillation frequency on the energy. 

The methods of finding the eigenvalues of the Hamiltonian described 
here can also be applied to other operators, for example, to the 
square of the angular momentum. For that the equation for the 
eigenfunction of such an operator must be reducible to the form 

$$\psi'' + p^2(x,\lambda)\,\psi = 0$$

where $\lambda$ is the eigenvalue of the investigated operator. 

The Quasi-Classical Limit of Matrix Elements. Let us now see what 
the matrix elements of operators transform into in the quasi-classical 
approximation. For a limiting process from the wave laws of motion 
of quantum mechanics to classical motion along paths, we must form 
wave packets from the wave functions: 

$$\psi = \int dE\,C(E)\,\psi(E,x), \qquad \psi^* = \int dE'\,C^*(E')\,\psi^*(E',x) \tag{31.45}$$

Since actually the classical path is considerably smeared in 
comparison with quantities on the microscopic scale, the energy interval, 
$\Delta E$, over which the integration is performed in (31.45) is very small. 

Let there be a certain operator $\hat\lambda(x)$ whose average value we must 
find over the state describing the wave packet (31.45). 
In accordance with the general formula (25.19), we write 

$$\langle\lambda(t)\rangle = \int\psi^*\hat\lambda\,\psi\,dx = \int dE\,C(E)\int dE'\,C^*(E')\,e^{i(E'-E)t/\hbar}\,\lambda_{E'E} \tag{31.46}$$

We take advantage of the fact that the whole energy interval in 
the quasi-classical limit is very small and replace $E' - E$ by the 
quantity $\Delta E$ to get 

$$\langle\lambda(t)\rangle = \int\!\!\int dE\,d\Delta E\;C(E)\,C^*(E+\Delta E)\,e^{i\Delta E\,t/\hbar}\,\lambda_{E+\Delta E,\,E}$$

In $C^*(E + \Delta E)$ we can simply replace $\Delta E$ by 0. Then 
$C(E)\,C^*(E + \Delta E)$ transforms into $|C(E)|^2$, that is, the 
probability of the state per unit energy. In the matrix element $\lambda_{E'E}$ 
such a substitution would mean passing to a diagonal element, 
which in general does not depend upon time, whereas we are 
interested in $\langle\lambda(t)\rangle$ as a function of $t$. 

The matrix element $\lambda_{E+\Delta E,\,E}$ depends on the energy $E$ smoothly. 
It is possible to substitute into it some mean value of the energy $E_0$ 
in the interval $\Delta E$. Actually, $\langle\lambda\rangle = \langle\lambda(E_0, t)\rangle$ also depends upon $E_0$. 
Then the required mean value is expressed as follows: 

$$\langle\lambda(E_0,t)\rangle = \int dE\,|C(E)|^2\int d\Delta E\;e^{i\Delta E\,t/\hbar}\,\lambda_{E_0+\Delta E,\,E_0} \tag{31.47}$$

But the integral of all the probabilities, $\int dE\,|C(E)|^2$, is 
unity, and we come to the following: 

$$\langle\lambda(E_0,t)\rangle = \int d\Delta E\;e^{i\Delta E\,t/\hbar}\,\lambda_{E_0+\Delta E,\,E_0} \tag{31.48}$$

If we write the expansion of the classical quantity $\lambda(E_0, t)$ in 
a Fourier integral, a simple comparison of (31.48) and the expansion 

$$\lambda(E_0,t) = \int d\omega\;e^{i\omega t}\,\lambda(E_0,\omega)$$

shows that 

$$\lambda_{E_0+\Delta E,\,E_0} = \hbar^{-1}\,\lambda(E_0,\ \omega=\Delta E/\hbar) \tag{31.49}$$

Thus, the matrix element transforms into a coefficient of the 
Fourier expansion corresponding to the frequency with which the 
matrix element itself varies with time. In the case of a discrete 
spectrum, we should employ not the integral but a Fourier series, the 
principal interval of the expansion being the period of classical 
motion, $\tau$, determined from (31.39). 

Let us now apply the obtained result to show that the commutator 
between the Hamiltonian and the operator $\hat\lambda$ in the classical limit 
passes into the Poisson bracket for the corresponding functions 
$\mathcal{H}(x, p)$ and $\lambda(x, p)$ (see Sec. 27). 

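Let the operators $\hat{\mathcal{H}}$ and $\hat\lambda$ be in coordinate representation. Then 
their commutator looks like this: 

$$(\hat{\mathcal{H}}\hat\lambda-\hat\lambda\hat{\mathcal{H}})_{xx'} = \int dx''\,[\mathcal{H}(x,x'')\,\lambda(x'',x') - \lambda(x,x'')\,\mathcal{H}(x'',x')]$$

Every function of two variables can be identically represented 
as a function of the sum of, and difference between, those variables: 

$$\mathcal{H}(x,x'') = \mathcal{H}\left(\frac{x+x''}{2},\ x-x''\right)$$

$$\lambda(x'',x') = \lambda\left(\frac{x''+x'}{2},\ x''-x'\right)$$

$$\mathcal{H}(x'',x') = \mathcal{H}\left(\frac{x''+x'}{2},\ x''-x'\right)$$

We expand the functions of the difference in the right-hand sides 
of these identities in Fourier integrals in terms of the normalized 
momentum eigenfunctions, $\psi = (2\pi\hbar)^{-1/2}e^{ipx/\hbar}$, to get 

$$\mathcal{H}(x,x'') = \frac{1}{2\pi\hbar}\int e^{ip(x-x'')/\hbar}\,\mathcal{H}\left(\frac{x+x''}{2},\,p\right)dp$$

$$\lambda(x'',x') = \frac{1}{2\pi\hbar}\int e^{ip_1(x''-x')/\hbar}\,\lambda\left(\frac{x''+x'}{2},\,p_1\right)dp_1$$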

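In the classical limit, as was pointed out before, we must pass 
to wave packets and assume the differences between the coordinates 
to be small. We therefore expand $\mathcal{H}\left(\frac{x+x''}{2}, p\right)$, $\lambda\left(\frac{x''+x'}{2}, p_1\right)$ 
and the other Fourier amplitudes in power series 
in $x - x''$, $x'' - x'$, . . ., and restrict ourselves to linear terms: 

$$\mathcal{H}(x,x'') = \frac{1}{2\pi\hbar}\int e^{ip\,\Delta x/\hbar}\left(\mathcal{H}(x,p)-\frac{\Delta x}{2}\frac{\partial}{\partial x}\mathcal{H}(x,p)\right)dp$$

$$\lambda(x'',x') = \frac{1}{2\pi\hbar}\int e^{ip_1\Delta x''/\hbar}\left(\lambda(x',p_1)+\frac{\Delta x''}{2}\frac{\partial}{\partial x'}\lambda(x',p_1)\right)dp_1$$

where $\Delta x = x - x''$, and $\Delta x'' = x'' - x'$. Next we transform the 
linear terms by parts. For this we replace $\Delta x\,e^{ip\,\Delta x/\hbar}$ by 
$(\hbar/i)(\partial/\partial p)\,e^{ip\,\Delta x/\hbar}$ and take advantage of the fact that at the limits 
(for $p = \pm\infty$) the expansion coefficients must vanish. Hence 

$$\mathcal{H}(x,x'') = \frac{1}{2\pi\hbar}\int e^{ip\,\Delta x/\hbar}\left(\mathcal{H}(x,p)+\frac{\hbar}{2i}\frac{\partial^2\mathcal{H}(x,p)}{\partial x\,\partial p}\right)dp$$

$$\lambda(x'',x') = \frac{1}{2\pi\hbar}\int e^{ip_1\Delta x''/\hbar}\left(\lambda(x',p_1)-\frac{\hbar}{2i}\frac{\partial^2\lambda(x',p_1)}{\partial x'\,\partial p_1}\right)dp_1$$

(The signs of the second derivatives are opposite because the powers 
of the exponentials involve, respectively, $x - x''$ and $x'' - x'$.) 

We substitute the expansions of the matrix elements into the 
commutator for the operators and make use of the fact that changing 
the order of integration with respect to $x''$, $p$, and $p_1$ yields a 
$\delta$ function: 

$$\frac{1}{2\pi\hbar}\int e^{ix''(p_1-p)/\hbar}\,dx'' = \delta(p_1-p)$$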

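Evaluating the integral over $p_1$ with the help of the formula 

$$\int dp_1\,\delta(p_1-p)\,f(p_1) = f(p)$$

and retaining only the terms linear with respect to the action 
quantum, we arrive at the following expression for the required matrix 
element: 

$$(\hat{\mathcal{H}}\hat\lambda-\hat\lambda\hat{\mathcal{H}})_{xx'} = \frac{1}{2\pi\hbar}\int dp\;e^{ip(x-x')/\hbar}\,\Big\{\mathcal{H}(x,p)\,\lambda(x',p)-\mathcal{H}(x',p)\,\lambda(x,p)$$
$$\qquad-\frac{\hbar}{2i}\Big(\mathcal{H}(x,p)\frac{\partial^2\lambda(x',p)}{\partial x'\,\partial p}-\frac{\partial^2\mathcal{H}(x,p)}{\partial x\,\partial p}\lambda(x',p)+\frac{\partial^2\lambda(x,p)}{\partial x\,\partial p}\mathcal{H}(x',p)-\lambda(x,p)\frac{\partial^2\mathcal{H}(x',p)}{\partial x'\,\partial p}\Big)\Big\}$$

It is significant that $\partial^2\mathcal{H}/\partial x\,\partial p$ and $\partial^2\mathcal{H}/\partial x'\,\partial p$ are now involved with the 
same signs, since the exponentials have, respectively, $x - x''$ and 
$x'' - x'$. We again transform by parts the terms proportional to $\hbar$. 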
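As an example, let us consider one of them: 

$$\int e^{ip(x-x')/\hbar}\,\mathcal{H}\,\frac{\partial^2\lambda}{\partial x'\,\partial p}\,dp = -\int\left(\frac{i}{\hbar}(x-x')\,\mathcal{H}+\frac{\partial\mathcal{H}}{\partial p}\right)\frac{\partial\lambda}{\partial x'}\,e^{ip(x-x')/\hbar}\,dp$$

Passing to the limit, we should assume that $x' \to x$ everywhere 
except, of course, the exponential $e^{ip(x-x')/\hbar}$ with respect to which 
the expansion is performed. Collecting all terms linear with respect 
to $\hbar$, we then obtain the commutator in the classical limit: 

$$(\hat{\mathcal{H}}\hat\lambda-\hat\lambda\hat{\mathcal{H}})_{xx'} = \frac{1}{2\pi\hbar}\int dp\;e^{ip(x-x')/\hbar}\,\frac{\hbar}{i}\left(\frac{\partial\mathcal{H}}{\partial p}\frac{\partial\lambda}{\partial x}-\frac{\partial\mathcal{H}}{\partial x}\frac{\partial\lambda}{\partial p}\right) \tag{31.50}$$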


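We now expand the matrix element of the commutator in a Fourier 
integral in the same way as we expanded $\mathcal{H}$ and $\lambda$ to get 

$$(\hat{\mathcal{H}}\hat\lambda-\hat\lambda\hat{\mathcal{H}})_{xx'} = \frac{1}{2\pi\hbar}\int dp\;e^{ip(x-x')/\hbar}\,(\hat{\mathcal{H}}\hat\lambda-\hat\lambda\hat{\mathcal{H}})_{xp} \tag{31.51}$$

Here, we can directly assume the expression in parentheses to be 
a function of $x$ and $p$. Comparing the coefficients of the Fourier 
expansion (31.51) with (31.50), we see that on the one hand we have 
the classical limit of a matrix element of the commutator 
$(\hat{\mathcal{H}}\hat\lambda - \hat\lambda\hat{\mathcal{H}})$, and on the other we have the Poisson bracket: 

$$\lim_{\hbar\to 0}\frac{i}{\hbar}\,(\hat{\mathcal{H}}\hat\lambda-\hat\lambda\hat{\mathcal{H}}) = \frac{\partial\mathcal{H}}{\partial p}\frac{\partial\lambda}{\partial x}-\frac{\partial\mathcal{H}}{\partial x}\frac{\partial\lambda}{\partial p} \tag{31.52}$$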


as was pointed out in Section 27. Note that the next term in the 
expansion of the commutator already involves $\hbar^2$. 
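A quick check of (31.52) can be made on a case where the commutator is known exactly. The worked lines below assume the particular Hamiltonian $\mathcal{H} = p^2/2m + U(x)$ with $\hat p = (\hbar/i)\,\partial/\partial x$; this choice is made only for illustration. For $\lambda = x$ and $\lambda = p$ the commutator is already linear in $\hbar$, so the limit is attained without further corrections.

```latex
% Assumed example: H = p^2/2m + U(x),  \hat p = (\hbar/i)\,\partial/\partial x.
\begin{align*}
% lambda = x
(\hat{\mathcal H}x - x\hat{\mathcal H})\psi
  &= -\frac{\hbar^2}{2m}\left[(x\psi)'' - x\psi''\right]
   = -\frac{\hbar^2}{m}\,\psi'
   = \frac{\hbar}{i}\,\frac{\hat p}{m}\,\psi ,
\\
\frac{i}{\hbar}\,(\hat{\mathcal H}x - x\hat{\mathcal H})
  &= \frac{\hat p}{m}
  \quad\longleftrightarrow\quad
  \frac{\partial\mathcal H}{\partial p}\frac{\partial x}{\partial x}
  - \frac{\partial\mathcal H}{\partial x}\frac{\partial x}{\partial p}
  = \frac{p}{m} ,
\\[4pt]
% lambda = p
(\hat{\mathcal H}\hat p - \hat p\hat{\mathcal H})\psi
  &= \frac{\hbar}{i}\left[U\psi' - (U\psi)'\right]
   = -\frac{\hbar}{i}\,\frac{dU}{dx}\,\psi ,
\\
\frac{i}{\hbar}\,(\hat{\mathcal H}\hat p - \hat p\hat{\mathcal H})
  &= -\frac{dU}{dx}
  \quad\longleftrightarrow\quad
  \frac{\partial\mathcal H}{\partial p}\frac{\partial p}{\partial x}
  - \frac{\partial\mathcal H}{\partial x}\frac{\partial p}{\partial p}
  = -\frac{dU}{dx} .
\end{align*}
```

Both results reproduce the classical equations of motion, $\dot x = p/m$ and $\dot p = -dU/dx$, in the sign convention of (31.52).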


EXERCISES 

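1. Determine the energy levels of a linear harmonic oscillator from 
Eq. (31.42). 

Solution. 

$$2\int_{-(2E/m\omega^2)^{1/2}}^{(2E/m\omega^2)^{1/2}}\left[2m\left(E-\frac{m\omega^2x^2}{2}\right)\right]^{1/2}dx = \frac{2\pi E}{\omega} = 2\pi\left(n+\tfrac12\right)\hbar$$

From this, $E_n = \hbar\omega\,(n + 1/2)$, which is in fact valid for all $n$, from 
zero on. Note that in a formula suitable only for large $n$ the term 1/2 is 
meaningless. 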

2. Determine the factor $D$ for a potential barrier of the form: $U = 0$ 
for $x < 0$, $U = U_0 - ax$ for $x \ge 0$, and $E < U_0$. 

3. Compare the accidental energy degeneracy in the hydrogen-atom 
problem with the expression for energy in terms of the adiabatic invariants 
in Kepler’s problem (Exercise 3, Section 10). 
PERTURBATION THEORY 

The Hamiltonian of a quantum mechanical system frequently 
involves a term multiplied by a small parameter. If the exact wave 
equation does not admit of an analytical solution, or if it is very 
complicated, it is useful to seek a solution in the form of an expansion 
in powers of the small parameter. 

The problem can be posed in two ways. Sometimes the small term 
in the Hamiltonian, which is called the perturbation, but slightly 
affects the energy and the wave function of the unperturbed state, 
the perturbed and unperturbed states both being stationary. 

In other cases the perturbation makes a stationary state 
nonstationary, so that a finite probability appears of the system passing 
into an entirely different state, for example, from one with discrete 
energy eigenvalues to one with a continuous energy spectrum. 

These cases must be examined separately. 

Time Independent Perturbations. Nondegenerate Motion. We shall 
first examine the simplest case, when the energy eigenvalues of the 
Hamiltonian corresponding to the unperturbed state of a system 
are not degenerate. Denoting the unperturbed Hamiltonian $\hat{\mathcal{H}}_0$, 
and the perturbation $\hat V$, the total Hamiltonian is 

$$\hat{\mathcal{H}} = \hat{\mathcal{H}}_0 + \hat V \tag{32.1}$$

The equation for the eigenfunctions of the unperturbed 
Hamiltonian looks like this: 

$$(\hat{\mathcal{H}}_0 - E_{n0})\,\psi_{n0} = 0 \tag{32.2}$$

We assume that $\hat{\mathcal{H}}_0$ has a discrete spectrum and that to each 
eigenvalue $E_{n0}$ there corresponds one eigenfunction $\psi_{n0}$, these 
functions being orthogonal and normalized to unity: 

$$\int\psi^*_{n'0}\,\psi_{n0}\,dx = \delta_{n'n} \tag{32.3}$$

The perturbation involves a certain small parameter, but we shall 
not write it in explicit form, remembering that the order of each 
expression is determined by its degree with respect to $\hat V$. 

We shall seek the correction to the eigenfunction and to the energy 
eigenvalue with number $n$. The corresponding exact equation for 
the energy eigenvalues is 

$$(\hat{\mathcal{H}}_0 + \hat V - E_n)\,\psi_n = 0 \tag{32.4}$$

The solution is conveniently sought in the form of an expansion in 
eigenfunctions of the unperturbed Hamiltonian, $\psi_{n0}$, which are 
determined from (32.3): 

$$\psi_n = \sum_{n'}C_{n'}\,\psi_{n'0} \tag{32.5}$$

Substituting this expansion into Eq. (32.4) and making use of the 
fact that $\psi_{n'0}$ are eigenfunctions of $\hat{\mathcal{H}}_0$, we obtain 

$$\sum_{n'}C_{n'}\,(E_{n'0} - E_n + \hat V)\,\psi_{n'0} = 0$$

This equation is exact. It is readily transformed into an equation 
for the coefficients $C_{n'}$, which corresponds to a transition to a 
representation in which the energy of the unperturbed system (32.2) is 
the independent variable. For this we premultiply the equation 
by $\psi^*_{k0}$ and integrate over the whole volume (we denote the volume 
element $dx$ so as to avoid confusion with the operator $\hat V$). Making use 
of the orthogonality of wave functions, we find 

$$(E_{k0} - E_n)\,C_k = -\sum_{n'}V_{kn'}\,C_{n'} \tag{32.6}$$

Here $V_{kn'}$ denotes a matrix element of the perturbation. 

The relation (32.6) represents an infinite set of linear equations 
with respect to the expansion coefficients $C_{n'}$. This set is solved by 
the method of expansion in a power series in the small parameter involved 
in $\hat V$. We represent the energy of the $n$th state in the form 

$$E_n = E_{n0} + E_{n1} + E_{n2} + \dots \tag{32.7}$$

and the coefficients of the expansion in the eigenfunctions of the 
unperturbed motion as 

$$C_k = \delta_{kn} + C_{k1} + C_{k2} + \dots \tag{32.8}$$

Now, in the left-hand side of (32.6) we put $k = n$; then in the first 
approximation there remains only one correction to the energy, $E_{n1}$. 
In the right-hand side we must retain the first-order term with respect 
to $\hat V$, or $V_{nn}$. Hence, in the first approximation the equation for the 
correction to the energy of the unperturbed motion is 

$$E_{n1} = V_{nn} \tag{32.9}$$

But the diagonal matrix element of the perturbation is 

$$V_{nn} = \int\psi^*_{n0}\,\hat V\,\psi_{n0}\,dx \tag{32.10}$$

that is, in accordance with (25.19) the first-approximation 
correction to the energy is equal to the mean value of the perturbation 
energy over the unperturbed state. 

If $k \ne n$, the difference $E_{k0} - E_n$ is of zero order with respect 
to the perturbation; it is also other than zero in the unperturbed 
system. Hence the whole left-hand side of Eq. (32.6) is of the first 
order, since it involves $C_k$ ($k \ne n$). To have the same order on the 
right as well, we must retain with $V_{kn'}$ only the coefficient with the 
number $n' = n$, in accordance with (32.8). From this we find $C_{k1}$: 

$$C_{k1} = -\frac{V_{kn}}{E_{k0}-E_{n0}} \tag{32.11}$$

(we have replaced $E_n$ by $E_{n0}$ in the denominator, since we already 
have $V_{kn}$ in the numerator). 

With the help of (32.5) we find the eigenfunction (in the first 
approximation): 

$$\psi_n = \psi_{n0} - {\sum_k}'\,\frac{V_{kn}}{E_{k0}-E_{n0}}\,\psi_{k0} \tag{32.12}$$

where the primed summation sign denotes that the term with $k = n$ 
has been discarded. 

From Eqs. (32.11) and (32.12), we can see the requirement the 
perturbation must satisfy to be considered small: 

$$|V_{kn}| \ll |E_{k0}-E_{n0}| \tag{32.13}$$

It should be small not with respect to the unperturbed energy 
eigenvalues, but with respect to their differences. The matrix 
element of the perturbation between the states $k$ and $n$ must be 
substantially smaller than the distance between the $k$th and $n$th 
energy levels. 

Often the mean value of the perturbation with respect to the 
unperturbed state vanishes. Then the next approximation should 
be used. We again substitute $k = n$ into Eq. (32.6), but on the left 
retain the terms up to the second order. Taking into account that on 
the right we already have $V_{nn'}$, we must retain the coefficients $C_{n'}$ 
up to the first order: 

$$E_{n1} + E_{n2} = V_{nn} + {\sum_{n'}}'\,V_{nn'}\,C_{n'1} = V_{nn} - {\sum_{n'}}'\,\frac{V_{nn'}V_{n'n}}{E_{n'0}-E_{n0}} \tag{32.14}$$

The quantities $E_{n1}$ and $V_{nn}$ cancel out in accordance with (32.9), 
leaving the expression for the correction to the energy of the $n$th 
state in the second approximation: 

$$E_{n2} = -{\sum_{n'}}'\,\frac{|V_{nn'}|^2}{E_{n'0}-E_{n0}} \tag{32.15}$$

Here we made use of the fact that $V_{nn'} = V^*_{n'n}$, since $\hat V$ must be 
a Hermitian operator. 

Note that if $n = 0$, that is, the ground state is being perturbed, 
the second-order correction (32.15) to the energy $E_0$ must necessarily 
be negative, since every denominator $E_{n'0} - E_{00}$ is then positive. 
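The first- and second-order formulas (32.9) and (32.15) can be compared with an exact result in the smallest possible case, a system with only two unperturbed levels. The sketch below is an illustration under assumed numbers: the level energies, the matrix elements of the perturbation, and the helper `exact_levels` are all hypothetical and chosen so that condition (32.13) is well satisfied.

```python
import math


def exact_levels(h11, h22, h12):
    """Eigenvalues of the real symmetric matrix [[h11, h12], [h12, h22]], lowest first."""
    mean = 0.5 * (h11 + h22)
    half_gap = math.hypot(0.5 * (h11 - h22), h12)
    return mean - half_gap, mean + half_gap


# Unperturbed energies and perturbation matrix elements (illustrative values only).
E00, E10 = 0.0, 1.0
V00, V11, V01 = 0.003, -0.002, 0.010

first_order = V00                             # (32.9): mean value of V in the ground state
second_order = -abs(V01) ** 2 / (E10 - E00)   # (32.15): the sum has a single term here
ground_pt = E00 + first_order + second_order

# Exact ground-state energy of H0 + V for comparison.
ground_exact, _ = exact_levels(E00 + V00, E10 + V11, V01)
error = abs(ground_exact - ground_pt)
```

With these numbers `second_order` is negative, as stated for the ground state, and `error` is of third order in the perturbation, much smaller than `second_order` itself.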

The Variation Property of Eigenvalues. This last result can be 
explained on the basis of very general considerations. We pointed 
out that finding the eigenvalues of an operator represents no more 
than a reduction to the principal axes of a second-order surface in 
Hilbert space. But the principal axes of a quadratic surface possess 
the variation property: in an infinitesimal rotation of the coordinate 
axes (close to the principal axes) the radius of a surface drawn from 
the origin of the coordinate system is stationary, that is (in the 
first approximation) it does not change. Consequently, if we speak 
of the zero, or ground, state, the corresponding principal axis is the 
smallest; any approach to this state leads only to a reduction in 
the energy. 

The eigenvectors of $\hat{\mathcal{H}}_0$ did not yield the actual principal axes 
of the operator $\hat{\mathcal{H}} = \hat{\mathcal{H}}_0 + \hat V$. But when a correction to the wave 
functions, that is, the eigenvectors of $\hat{\mathcal{H}}$, was introduced with the 
help of (32.12), the eigenvalues came closer to the actual ones. 
In particular, the ground-state energy was brought closer to its actual 
value and, consequently, it decreased. 

As for the correction to the first approximation, $E_{n1} = V_{nn}$, 
it can be of any sign. This correction is computed from the 
eigenfunctions of the unperturbed state, without rotating the axes in 
Hilbert space. In effect it should simply be included in the energy 
of the unperturbed state. It determines the correction to the radius 
of the quadratic surface of the exact Hamiltonian $\hat{\mathcal{H}} = \hat{\mathcal{H}}_0 + \hat V$ 
drawn in the direction of the $n$th principal axis of the unperturbed 
Hamiltonian. It is precisely this quantity, corrected in the first 
approximation, that is further refined by a rotation of the axis 
according to the formula (32.12), and it is then found that in a more 
exact refinement of the ground-state energy it can receive only 
a negative correction, $E_{02}$. 

The variation property of eigenvalues can be very simply proved 
in the following way. We seek the wave function $\psi$, which gives the 
integral 

$$\langle\mathcal{H}\rangle = \int\psi^*\hat{\mathcal{H}}\,\psi\,dx \tag{32.16}$$

its extremal value, with the additional condition 

$$1 = \int\psi^*\psi\,dx \tag{32.17}$$

that is, that it remains always normalized to unity. 

To solve the extremum problem with the additional condition, 
we should employ the method of undetermined multipliers: vary 
both integrals (32.16) and (32.17) with respect to $\psi^*$, multiply the 
variation of the second integral by the as yet undetermined quantity 
$-E$, add after this the variations of the integrals, and equate their 
sum to zero to get 

$$\int\delta\psi^*\,(\hat{\mathcal{H}} - E)\,\psi\,dx = 0$$

For this equation to hold for any variation $\delta\psi^*$ we must require 
that 

$$(\hat{\mathcal{H}} - E)\,\psi = 0 \tag{32.18}$$

But this is the equation for finding the eigenfunctions of $\hat{\mathcal{H}}$. 
Hence, any eigenvalue corresponds to a stationary value of $\langle\mathcal{H}\rangle$, 
the ground state having the least possible value. 

This property of the ground-state energy is used in the following 
way. For example, we have a neutral atom and an electron and have 
to determine whether they can join, forming a negative ion (electrons 
can be captured by atoms of hydrogen and several other elements). 

It is extremely difficult to solve the problem of the motion of two 
electrons in a nuclear field. An analytical solution does not exist at 
all. This is where the variation method comes in. Certain trial 
functions are chosen which satisfy the boundary conditions, that is, are 
equal to zero at infinity, as should be in the case of finite motion of a 
bound electron. These functions by no means satisfy the exact 
Schrodinger equation for the given problem and can only resemble 
its solution somewhat. 

If substitution into (32.16) yields a negative value of $\langle\mathcal{H}\rangle$, we can 
be sure that the actual energy eigenvalue in the ground state lies 
even lower, that is, it is also negative. But negative energy 
eigenvalues indeed correspond to finite motion, or to the bound state of 
the electron. Usually the functions substituted into the expression for 
$\langle\mathcal{H}\rangle$ involve a certain number of parameters at the discretion of the 
person carrying out the computation. If the point of interest is 
the ground-state energy, these functions should never vanish at finite 
distances from the origin of the coordinate system, which somewhat 
restricts their choice. After the integrals (32.16) and (32.17) have 
been calculated, the parameters are so chosen as to have $\langle\mathcal{H}\rangle$ as 
small as possible while preserving the normalization of the 
function. 

It should be noted that this method can be used only for determin¬ 
ing, more or less accurately, the energy. The wave function may not 
resemble the actual function very much. 
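The procedure just described can be tried on a problem whose exact answer is known. The sketch below assumes a hydrogen atom in atomic units and a one-parameter Gaussian trial function $e^{-\alpha r^2}$, for which the integrals (32.16) and (32.17) give $\langle\mathcal{H}\rangle = 3\alpha/2 - 2(2\alpha/\pi)^{1/2}$. The trial function, the units, and the function names are choices made for this illustration only.

```python
import math


def energy(alpha):
    """<H> for the normalized trial function exp(-alpha*r^2), atomic units.

    Kinetic part: 3*alpha/2.  Coulomb part: -2*sqrt(2*alpha/pi).
    """
    return 1.5 * alpha - 2.0 * math.sqrt(2.0 * alpha / math.pi)


def minimize(f, lo, hi, iterations=200):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iterations):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if f(c) < f(d):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return 0.5 * (a + b)


best_alpha = minimize(energy, 1.0e-3, 5.0)
best_energy = energy(best_alpha)
EXACT_GROUND = -0.5  # exact hydrogen ground-state energy in atomic units
```

The minimum lies at $\alpha = 8/(9\pi)$ with $\langle\mathcal{H}\rangle = -4/(3\pi) \approx -0.424$. The value is negative, so the trial function already signals a bound state, and it lies above the exact value $-0.5$, as the variation property requires. The Gaussian itself is a poor likeness of the true exponential wave function, which illustrates the remark that the method is reliable for the energy rather than for the wave function.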

Degenerate Perturbation Theory. Suppose now that the $n$th energy 
eigenvalue in the unperturbed state is degenerate. There are several 
functions that satisfy the same equation 

$$\hat{\mathcal{H}}_0\,\psi_{n\lambda 0} = E_{n0}\,\psi_{n\lambda 0} \tag{32.19}$$

Here $\lambda$ may correspond to the eigenvalue of some operator that 
commutes with the Hamiltonian. Then it is necessary to choose 
correctly the eigenfunction of zero approximation, $\psi_{n0}$, which should 
be given the form 

$$\psi_{n0} = \sum_{\lambda}a_{n\lambda}\,\psi_{n\lambda 0} \tag{32.20}$$

that is, it must be sought as a certain linear superposition of the 
initial functions. In this case the expansion (32.8) should be 
represented as follows: 

$$C_{n'\lambda} = \delta_{nn'}\,a_{n\lambda} + C_{n'\lambda 1} + \dots \tag{32.21}$$

The equation for finding $E_{n1}$ changes accordingly: 

$$E_{n1}\,a_{n\lambda} = \sum_{\lambda'}V_{n\lambda,\,n\lambda'}\,a_{n\lambda'} \tag{32.22}$$

where $V_{n\lambda,\,n\lambda'}$ is the matrix element between the states $n\lambda$ and $n\lambda'$: 

$$V_{n\lambda,\,n\lambda'} = \int\psi^*_{n\lambda 0}\,\hat V\,\psi_{n\lambda' 0}\,dx \tag{32.23}$$

The set of equations (32.22) has a solution if the determinant 
vanishes: 

$$|V_{n\lambda,\,n\lambda'} - \delta_{\lambda\lambda'}E_{n1}| = 0 \tag{32.24}$$

In the most general case, the number of roots of the determinant 
is equal to the degeneracy multiplicity of the unperturbed state $\psi_{n0}$. 
To each root there corresponds a definite eigenvalue $E_{n1}$ and a 
definite set of expansion coefficients $a_{n\lambda}$. Thus, the degenerate state 
splits into nondegenerate states; in the most general case their number 
is equal to the degeneracy multiplicity. 
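For a doubly degenerate level the determinant (32.24) is a quadratic equation, and the splitting can be written out directly. The sketch below assumes a real symmetric $2\times 2$ block of the perturbation with a nonzero off-diagonal element; the function name and the sample numbers are hypothetical.

```python
import math


def split_levels(v_aa, v_bb, v_ab):
    """Roots E_n1 of (32.24) for a real symmetric 2x2 block of the perturbation,
    each with normalized coefficients (a_1, a_2) solving (32.22).

    Assumes v_ab != 0, so that the two basis functions are really mixed.
    """
    if v_ab == 0.0:
        raise ValueError("block is already diagonal; no mixing to compute")
    mean = 0.5 * (v_aa + v_bb)
    half_gap = math.hypot(0.5 * (v_aa - v_bb), v_ab)
    results = []
    for root in (mean - half_gap, mean + half_gap):
        # First row of (32.22): (v_aa - E) a_1 + v_ab a_2 = 0.
        a1, a2 = v_ab, root - v_aa
        norm = math.hypot(a1, a2)
        results.append((root, (a1 / norm, a2 / norm)))
    return results


# Equal diagonal elements and coupling 0.2: the level splits symmetrically.
(lower, lower_coeffs), (upper, upper_coeffs) = split_levels(0.0, 0.0, 0.2)
```

In this example the two roots are $\mp 0.2$ and the correct zero-approximation functions are the antisymmetric and symmetric combinations of the two initial functions, with coefficients of magnitude $1/\sqrt{2}$.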

Let us explain this from the physical point of view. Suppose that 
the degeneracy of the unperturbed state was of the necessary kind, that 
is, it was due to a certain symmetry in the statement of the problem. 
For example, in any central field, if the total angular momentum is not 
zero, degeneracy occurs with respect to the magnetic quantum 
number. If a perturbation violates central symmetry, then different 
energies of the system may correspond to different projections of the 
angular momentum on some axis. For example, in a magnetic 
field $\mathbf{H}$ a certain term $-\beta|\mathbf{H}|k$, where $\beta$ is the Bohr magneton, 
is added to the energy of the atom. A splitting of states occurs, but 
it is possible to find the correction to the energy without solving 
Eq. (32.24). This is because in the present case the perturbation 
caused by the magnetic field is 

$$\hat V = -\beta|\mathbf{H}|\,\hat M_z = -\beta|\mathbf{H}|\,\frac{\hbar}{i}\frac{\partial}{\partial\varphi} \tag{32.25}$$

But the operator $(\hbar/i)(\partial/\partial\varphi)$ is diagonal in the states with a given 
value of the magnetic quantum number $k$. 

Perturbation as a Cause of Quantum Transitions. Consider the 
following problem. Let the Hamiltonian of an unperturbed system 
have states belonging to both discrete and continuous energy spectra, 
but such that they refer to the same energy values. We saw an 
example of such states in Section 29: an excited atom may have 
sufficient energy to disintegrate into a positive ion and a free 
electron, but the electron remains bound, since otherwise the law of 
conservation of parity would be violated. (We remember that if the 
Hamiltonian is an even function of the coordinates, total parity of 
the wave function is conserved in the system.) 

Another example of such a state with a discrete spectrum capable 
of passing into a state with a continuous spectrum is an excited atom 
plus an electromagnetic field. As long as the atom does not interact 
with the field the excited state can exist indefinitely simply because 
the atom has nowhere to transfer its excitation energy. But if an 
external perturbation (the electromagnetic field) is made to act on 
the system, or if we take into account the known additional 
components of the Hamiltonian, which can be treated as a perturbation, 
the state with a discrete energy level is no longer strictly 
stationary. 

Suppose that in the first example a certain perturbation, the 
operator of which is not an even function of the coordinates, acts on the 
atom. This violates the parity conservation law, and transition to 
a continuous spectrum is possible. In the second example, the operator 
of an interaction between an electron and the electromagnetic field 
is included in the Hamiltonian. This interaction makes possible 
the emission of a light quantum. But the energy of the quantum is 
equal to $\hbar\omega$, and the frequency $\omega$ takes on a continuous set of values. 
Hence, the electromagnetic field energy has a continuous spectrum. 
Here, the initial state of the system is an excited atom and an 
electromagnetic field in the absence of quanta, while the final state is an 
atom in the ground state and one quantum in the field. 

The third example that should be cited is that of particle scattering 
on a force centre. Here, both particle states, the initial and the final, 
belong to a continuous spectrum, since the particle is free in them. 
If the scattering is elastic, the energy of the initial and final states 
of the particle is the same, and it is the direction of the momentum 
that changes. In this case the perturbation is due to the scattering 
centre, which transforms the particle from one state with a 
continuous spectrum to another but conserves its energy. In inelastic 
scattering a portion of the energy may be transmitted to the scatterer, 
but the important thing is that the particle’s energy nevertheless 
continues to belong to a continuous spectrum. 

Our task is to determine the probability of a perturbation causing 
the energy of a system to pass into the continuous spectrum. 

Since we are dealing with a transition, we should proceed from 
the time dependent Schrodinger equation 

$$-\frac{\hbar}{i}\frac{\partial\psi}{\partial t} = (\hat{\mathcal{H}}_0 + \hat V)\,\psi \tag{32.26}$$

The corresponding equation for the unperturbed Hamiltonian has 
the form 

$$-\frac{\hbar}{i}\frac{\partial\psi^{(0)}}{\partial t} = \hat{\mathcal{H}}_0\,\psi^{(0)} \tag{32.27}$$

Assuming $\hat V$ to be a small perturbation, we represent the wave 
function as 

$$\psi = \psi^{(0)} + \psi^{(1)} \tag{32.28}$$

and neglect the term $\hat V\psi^{(1)}$ as being of second order. Then for 
$\psi^{(1)}$ we obtain a nonhomogeneous equation 

$$-\frac{\hbar}{i}\frac{\partial\psi^{(1)}}{\partial t} - \hat{\mathcal{H}}_0\,\psi^{(1)} = \hat V\,\psi^{(0)} \tag{32.29}$$

We look for $\psi^{(1)}$ in the form of an expansion in the 
eigenfunctions of $\hat{\mathcal{H}}_0$: 

$$\psi^{(1)} = \sum_m C_m(t)\,\psi_m^{(0)} \tag{32.30}$$

each of the $\psi_m^{(0)}$ satisfying the homogeneous equation (32.27). 
Substituting (32.30) into the nonhomogeneous equation, and taking into 
account (32.27), we arrive at the equation 

$$-\frac{\hbar}{i}\sum_m\frac{dC_m}{dt}\,\psi_m^{(0)} = \hat V\,\psi^{(0)} \tag{32.31}$$

From this, with the help of the orthogonality property of $\psi_m^{(0)}$, 
we obtain the equations for the coefficients $C_m$. We multiply both 
sides of (32.31) by $\psi_n^{(0)*}$ and integrate over the volume. Then in the 
left-hand side there remains only the term $-(\hbar/i)(dC_n/dt)$, and in 
the right-hand side the matrix element of the perturbing potential: 

$$-\frac{\hbar}{i}\frac{dC_n}{dt} = V_{n1}\,C_1 \tag{32.32}$$

We separate the time dependence explicitly: 

$$V_{n1} = \int\psi_n^{(0)*}\,\hat V\,\psi_1^{(0)}\,dx = e^{i(E_n-E_1)t/\hbar}\left(\int\psi_n^{(0)*}\,\hat V\,\psi_1^{(0)}\,dx\right)_{t=0} \tag{32.33}$$

Since the system as a whole is assumed to be closed, its 
Hamiltonian does not depend upon time explicitly either in the term $\hat{\mathcal{H}}_0$ 
or in the term $\hat V$. Hence, the dependence of any matrix element 
upon time can be written similar to (32.33): 

$$V_{nm} = e^{i(E_n-E_m)t/\hbar}\,(V_{nm})_{t=0} \tag{32.34}$$


Assuming that at the initial instant $t = 0$ the system was in a state 
with energy $E_1$, that is, in the state $\psi_1^{(0)}$, we must put $C_1 = 1$, 
$C_{n\ne 1} = 0$. Therefore Eq. (32.32) is integrated thus: 

$$C_n(t) = -\frac{e^{i(E_n-E_1)t/\hbar}-1}{E_n-E_1}\,(V_{n1})_{t=0}$$

or, again including the exponential factor in the notation of the 
matrix element, that is, reverting to the functions $\psi_n^{(0)}$ and $\psi_1^{(0)}$, 
we obtain 

$$C_n(t) = -\frac{1-e^{-i(E_n-E_1)t/\hbar}}{E_n-E_1}\,V_{n1} \tag{32.35}$$


Consequently, from (25.13) the probability that at time $t$ the 
system will be found in the state with label $n$ is 

$$W_n(t) = |C_n(t)|^2 = \left(1-e^{-i(E_n-E_1)t/\hbar}\right)\left(1-e^{i(E_n-E_1)t/\hbar}\right)\frac{|V_{n1}|^2}{(E_n-E_1)^2}$$
$$= 2\left(1-\cos\frac{(E_n-E_1)\,t}{\hbar}\right)\frac{|V_{n1}|^2}{(E_n-E_1)^2} = 4\sin^2\frac{(E_n-E_1)\,t}{2\hbar}\;\frac{|V_{n1}|^2}{(E_n-E_1)^2} \tag{32.36}$$


By definition, the final state belongs to a continuous spectrum, 
so that we can write simply $E$ instead of $E_n$. Besides, it is more 
interesting to find the total probability of the system’s passing to 
a continuous energy-spectrum state. For this we must multiply 
(32.36) by the number of states belonging to the continuous-spectrum 
energy interval $dE$ and integrate over the energy. An example of 
such an expression of the number of states, $dN(E)$, was given in 
Section 28 (see (28.25)). 

We write Eq. (28.25) in general form as follows 

$$dN(E) = g(E)\,dE \tag{32.37}$$

The total probability of the required transition is 

$$W = \int w(E_1,E)\,dN(E) = \int\frac{4\sin^2[(E-E_1)\,t/(2\hbar)]}{(E-E_1)^2}\,|V(E,E_1)|^2\,g(E)\,dE \tag{32.38}$$

For the sake of clarity we shall write the indices $E$, $E_1$ of $V$ not 
as subscripts but in parentheses, like the arguments of a function, 
which they actually are with respect to $V_{EE_1}$. Denoting the sine 
argument by $\xi$, $(E - E_1)\,t/(2\hbar) = \xi$, and passing to the integration 
variable $\xi$, we obtain 

$$W = \frac{2t}{\hbar}\int\frac{\sin^2\xi}{\xi^2}\,\left|V\!\left(E_1+\frac{2\hbar\xi}{t},\,E_1\right)\right|^2 g\!\left(E_1+\frac{2\hbar\xi}{t}\right)d\xi \tag{32.39}$$

The function $(\sin^2\xi)/\xi^2$ has its principal maximum at $\xi = 0$; 
its next maximum is smaller by a factor of about twenty. Therefore the 
main contribution to the integral (32.39) comes from the values of $\xi$ 
of the order of unity. But then the time t can always be so chosen that 
$2h\xi/t$ is much smaller than $E_1$. In other words, in the arguments of 
the functions V and g we can legitimately replace $E_1 + 2h\xi/t$ simply 
by $E_1$ and take these functions outside the integral with respect to $\xi$. 
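
The statement about the secondary maximum is easy to check numerically. The following is only a sketch: the helper names `sinc2` and `secondary_maximum` are introduced here for illustration, and the scan is restricted to the first side lobe, between the zeros of the function at $\xi = \pi$ and $\xi = 2\pi$.

```python
import math

def sinc2(xi):
    """(sin xi / xi)^2, with the limiting value 1 at xi = 0."""
    if xi == 0.0:
        return 1.0
    return (math.sin(xi) / xi) ** 2

def secondary_maximum(lo=math.pi, hi=2.0 * math.pi, steps=20_000):
    """Scan the first side lobe of sinc2 on a uniform grid and return (xi, value)."""
    best_xi, best_val = lo, 0.0
    for k in range(steps + 1):
        xi = lo + (hi - lo) * k / steps
        val = sinc2(xi)
        if val > best_val:
            best_xi, best_val = xi, val
    return best_xi, best_val

xi_max, height = secondary_maximum()
ratio = sinc2(0.0) / height  # principal maximum divided by the next one
print(f"xi = {xi_max:.4f}, height = {height:.4f}, ratio = {ratio:.1f}")
```

The side lobe sits near $\xi \approx 4.49$, and the ratio of the two maxima comes out a little above twenty, in line with the estimate used in the text.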

We have thus demonstrated that if t is of sufficient duration, 
then the energies of the initial and final states, $E_1$ and $E$, are defined 
so precisely that they can simply be assumed equal, in agreement 
with the energy conservation law in a transition occurring in a 
closed system. Of course, the energy conservation law always holds, 
but for too small values of t it is impossible to define the energy of the 
final state accurately, since the uncertainty relation (31.37) for this 
case has the form $(E - E_1)\,t \sim 2\pi h$. But when $t \to \infty$, we obtain 
an exact equality, $E = E_1$. 

Since the function $(\sin^2\xi)/\xi^2$ decreases rapidly with increasing $\xi$, 
the integration may be extended from $-\infty$ to $\infty$. Since the other 
quantities were taken outside the integral sign, the integral itself 
is equal to the number 

$$\int_{-\infty}^{\infty} \frac{\sin^2\xi}{\xi^2}\,d\xi = \pi \tag{32.40}$$

Hence 

$$W = \frac{2\pi}{h}\,|V(E = E_1, E_1)|^2\,g(E_1)\,t \tag{32.41}$$

Then the probability of a transition in unit time is 

$$\frac{dW}{dt} = \frac{2\pi}{h}\,|V(E = E_1, E_1)|^2\,g(E_1) \tag{32.42}$$

In this notation it is specifically stressed that $V(E = E_1, E_1)$ 
is not a diagonal element of the perturbing potential, but only the 
matrix element corresponding to the transition of the system to 
a state with a continuous energy spectrum. 

The initial and final states were mutually degenerate, that is, 
they corresponded to the same energy, and the perturbation “mixed” 
them. 
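
The passage from (32.38) to (32.41) can be illustrated numerically. The sketch below rests on assumptions that are not in the text: $|V|^2$ and $g(E)$ are taken constant over the whole integration range, model units are used in which the book's $h$ (taken here to be the reduced Planck constant) equals 1, and the energy integral is cut off at a finite half-width. The function names are illustrative.

```python
import math

H = 1.0  # the book's h in model units (assumed to be the reduced Planck constant)

def transition_probability(t, v2, g, half_width=80.0, steps=80_000):
    """Midpoint-rule estimate of (32.38) for constant |V|^2 = v2 and g(E) = g."""
    de = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        d = -half_width + (k + 0.5) * de  # d = E - E_1, never exactly zero on this grid
        total += 4.0 * math.sin(d * t / (2.0 * H)) ** 2 / d ** 2
    return total * v2 * g * de

def golden_rule(t, v2, g):
    """Right-hand side of (32.41)."""
    return 2.0 * math.pi / H * v2 * g * t

for t in (10.0, 50.0):
    w = transition_probability(t, v2=1e-4, g=3.0)
    print(t, w, golden_rule(t, v2=1e-4, g=3.0), w / t)
```

Within the accuracy of the cutoff the numerical W grows linearly with t, so that W/t reproduces the constant rate (32.42).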


EXERCISES 


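1. The potential energy of a system is a homogeneous function of the 
nth degree in all the coordinates. Find the relation between the integrals 

$$\langle T\rangle = -\frac{h^2}{2m}\int \psi^*\nabla^2\psi\,dV, \qquad \langle U\rangle = \int \psi^* U\psi\,dV$$

that is, between the mean kinetic and mean potential energies of the system. 

Solution. Suppose that the length scale has changed $\alpha$-fold, $\mathbf{r} \to \alpha\mathbf{r}$. 
Since $\psi$ remains normalized, the product $\psi^*\psi\,dV$ does not change with the 
scale. The kinetic energy, which involves the Laplacian, receives the factor 
$\alpha^{-2}$, and the potential energy receives, by definition, the factor $\alpha^n$. Hence 
the mean energy of the system $\langle\mathcal{H}\rangle = \langle T\rangle + \langle U\rangle$ transforms in the 
following way: 

$$\langle\mathcal{H}_\alpha\rangle = \frac{\langle T\rangle}{\alpha^2} + \alpha^n\langle U\rangle$$

But from the extremality of eigenvalues, the derivative of $\langle\mathcal{H}_\alpha\rangle$ in a 
stationary state must be zero: 

$$\frac{d\langle\mathcal{H}_\alpha\rangle}{d\alpha} = -\frac{2\langle T\rangle}{\alpha^3} + n\alpha^{n-1}\langle U\rangle = 0$$

Reverting to the initial scale, we put $\alpha = 1$, so that 

$$2\langle T\rangle = n\langle U\rangle$$

In particular, for the Coulomb interaction n = -1. Then 

$$\langle U\rangle = -2\langle T\rangle, \qquad \langle\mathcal{H}\rangle = \frac{1}{2}\langle U\rangle$$

as was obtained in Section 22 on the basis of classical laws. 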
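
The scaling argument can be checked with a few lines of arithmetic. This is only a sketch: `scaled_energy` encodes the transformed mean energy used in the solution, the derivative at the unit scale is taken by a central difference, and the sample values of the mean energies are arbitrary numbers chosen to satisfy the relation $2\langle T\rangle = n\langle U\rangle$.

```python
def scaled_energy(alpha, t_mean, u_mean, n):
    """Mean energy after the scale change r -> alpha * r."""
    return t_mean / alpha ** 2 + alpha ** n * u_mean

def derivative_at_unit_scale(t_mean, u_mean, n, eps=1e-6):
    """Central-difference estimate of d<H_alpha>/d(alpha) at alpha = 1."""
    up = scaled_energy(1.0 + eps, t_mean, u_mean, n)
    down = scaled_energy(1.0 - eps, t_mean, u_mean, n)
    return (up - down) / (2.0 * eps)

# Coulomb interaction, n = -1: <U> = -2<T>
coulomb = derivative_at_unit_scale(t_mean=1.0, u_mean=-2.0, n=-1)
# Harmonic oscillator, n = 2: <U> = <T>
oscillator = derivative_at_unit_scale(t_mean=1.0, u_mean=1.0, n=2)
# Mean energies violating the relation give a nonzero derivative
violated = derivative_at_unit_scale(t_mean=1.0, u_mean=-1.0, n=-1)
print(coulomb, oscillator, violated)
```

The first two derivatives vanish to within rounding error, while the third equals $-2\langle T\rangle + n\langle U\rangle = -1$.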


2. A linear harmonic oscillator is subjected to a perturbation of the 
form $V = \alpha x^4$. Show that in the first approximation the correction to the 
nth energy eigenvalue is equal to 



(make use of the matrix elements (27.28a)). 
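
The first-order correction is $\alpha\langle n|x^4|n\rangle$, and the diagonal element can be evaluated by multiplying matrices of the coordinate. The sketch below assumes the standard oscillator matrix elements $x_{k,k+1} = x_{k+1,k} = \sqrt{(k+1)h/(2m\omega)}$ (presumably those of (27.28a)) and works in units where $h/(m\omega) = 1$; the function names and the closed form $\tfrac{3}{4}(2n^2 + 2n + 1)$, the standard textbook value, are used here for comparison.

```python
import math

def position_matrix(size):
    """Matrix of x in the oscillator basis, in units of sqrt(h / (m * omega))."""
    x = [[0.0] * size for _ in range(size)]
    for k in range(size - 1):
        x[k][k + 1] = x[k + 1][k] = math.sqrt((k + 1) / 2.0)
    return x

def matmul(a, b):
    """Plain matrix product of two square lists of lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def x4_diagonal(n):
    """<n|x^4|n>; the padding keeps the basis truncation away from level n."""
    x = position_matrix(n + 6)
    x2 = matmul(x, x)
    x4 = matmul(x2, x2)
    return x4[n][n]

def closed_form(n):
    """Standard closed form, in units of (h / (m * omega))^2."""
    return 0.75 * (2 * n * n + 2 * n + 1)

for n in range(4):
    print(n, x4_diagonal(n), closed_form(n))
```

Under these assumptions the correction reads $\Delta E_n = \tfrac{3}{4}\alpha\,(h/m\omega)^2(2n^2 + 2n + 1)$.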
3. An unperturbed system has two close energy levels the difference 
between which is comparable with the matrix elements of the perturbing 
potential between these states. Determine the energy correction in the first 
approximation. 

Solution. By analogy with the perturbation method applied to degenerate 
states, when the energy eigenvalues of different states are strictly equal, 
we seek the wave function in the form 

$$\psi = C_1\psi_{01} + C_2\psi_{02}$$

Here, $\psi_{01}$ and $\psi_{02}$ are the wave functions of both states without the 
perturbation. This wave function will describe two new states to which the 
initial states transform as a result of the perturbation. We substitute this 
function into the equation for finding the eigenvalues of $\mathcal{H} = \mathcal{H}_0 + V$: 

$$\mathcal{H}\psi = (\mathcal{H}_0 + V)(C_1\psi_{01} + C_2\psi_{02}) = E\psi = E\,(C_1\psi_{01} + C_2\psi_{02})$$

We multiply this equation once by $\psi_{01}^*$ and once by $\psi_{02}^*$ and integrate 
over the whole volume. Besides, we make use of the fact that $\mathcal{H}_0\psi_{01} = E_{10}\psi_{01}$, 
$\mathcal{H}_0\psi_{02} = E_{20}\psi_{02}$, and of the orthogonality and normalization of the 
eigenfunctions. 

We obtain a set of two linear homogeneous algebraic equations: 

$$(E_{10} + V_{11} - E)\,C_1 + V_{12}C_2 = 0$$

$$V_{21}C_1 + (E_{20} + V_{22} - E)\,C_2 = 0$$

It has a solution, provided the determinant developed from the coefficients 
vanishes: 

$$E^2 - (E_{10} + E_{20} + V_{11} + V_{22})\,E + (E_{10} + V_{11})(E_{20} + V_{22}) - |V_{12}|^2 = 0$$

which yields two values for the “perturbed” energy: 

$$E = \frac{1}{2}(E_{10} + E_{20} + V_{11} + V_{22}) \pm \sqrt{\frac{1}{4}(E_{10} + V_{11} - E_{20} - V_{22})^2 + |V_{12}|^2}$$

If the matrix element $V_{12}$ vanishes, we obtain the conventional 
formula for first-approximation energy corrections: to $E_{10}$ is added 
$V_{11}$, to $E_{20}$ is added $V_{22}$. From the condition of the problem the matrix 
element $V_{12}$ is of the same order as $E_{10} - E_{20}$, so that the energy levels are 
substantially rearranged with respect to the initial configuration. 
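
As a numerical illustration, the two roots of the secular equation can be computed directly and substituted back into the quadratic. The function names and the sample numbers below are arbitrary choices made for this sketch; $V_{12}$ is allowed to be complex, with $V_{21} = V_{12}^*$.

```python
import math

def perturbed_levels(e10, e20, v11, v22, v12):
    """Return (E_plus, E_minus), the two roots of the 2x2 secular equation."""
    a = e10 + v11
    b = e20 + v22
    mean = 0.5 * (a + b)
    root = math.sqrt(0.25 * (a - b) ** 2 + abs(v12) ** 2)
    return mean + root, mean - root

def secular_residual(e, e10, e20, v11, v22, v12):
    """Left-hand side of the quadratic equation for E."""
    a = e10 + v11
    b = e20 + v22
    return e * e - (a + b) * e + a * b - abs(v12) ** 2

params = dict(e10=1.00, e20=1.20, v11=0.05, v22=-0.03, v12=0.10 + 0.20j)
e_plus, e_minus = perturbed_levels(**params)
print(e_plus, e_minus)
print(secular_residual(e_plus, **params), secular_residual(e_minus, **params))
```

With $V_{12} = 0$ the same function returns $E_{10} + V_{11}$ and $E_{20} + V_{22}$, the ordinary first-approximation values. For nonzero $V_{12}$ the two levels are pushed apart.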

4. Find the functions $\psi_+$ and $\psi_-$ of the preceding problem, taking into 
account that $|C_1|^2 + |C_2|^2 = 1$. Make use of the notation 

_ ?Fi 2 _ = _ t tP 

EiO + V U -E„-V„ - 

Answer. 

t|5 + = cos-2- e’P/ 2 t|?oi — sin —■ e _i P /2 ij ) 0 2 
if - = sin-2- e*P/ 2 if 01 -f cos -2- e~»P /2 i|) 02 

where the + and — signs correspond to the choice of signs in the energy 
formula. 


MANY-ELECTRON SYSTEM. THE ATOM 

In examining the law of addition of the angular momenta of several 
particles or the total parity of the state of an atom, we touched upon 
questions associated with the many-body problem in quantum 
mechanics. But in quantum mechanics a system consisting of iden¬ 
tical particles possesses a very special property stemming from the 
quantum laws of motion; this property will be examined in the pres¬ 
ent section. It involves the physical identity of like particles. Since 
it is impossible to trace the motion of every such identical particle— 
there is no such thing as path in quantum mechanics—there is no 
way of indicating the state of a certain selected electron. 

In classical mechanics, the state of affairs is rather different: at 
the initial time one can adopt some convention for numbering 
identical particles and then, by virtue of the continuity of classical 
motion, indicate the specific electron at a given point of space mov¬ 
ing with a given velocity. 

In quantum mechanics such a statement of the problem is physic¬ 
ally meaningless. Instead, the state of a system of many identical 
particles must be defined as follows: list the possible states of an 
individual particle and indicate the number of particles to be found 
in each of them. A more detailed definition is incompatible with 
the fundamental principles of quantum mechanics. 

Pauli’s Exclusion Principle. In the specific case of a many-electron 
system, experimental data impose the following additional restric¬ 
tion: not more than one electron can be found in each of the separate 
states. A state may be either occupied or unoccupied by one electron. 
This statement is known as Pauli's exclusion principle. It is an 
additional restriction which does not derive from the fundamentals 
of quantum mechanics we have investigated up to now. Its 
justification is to be found in quantum field theory. 

We confine ourselves to the following statement: Pauli’s principle 
is applicable to all particles possessing half-integral spin, that is, 
electrons, neutrons, and hyperons (hyperons are, so to say, the 
excited state of a nucleon: they have a greater mass than a nucleon, 
and half-integral spin); it does not apply to particles with zero or 
integral spin, among which electromagnetic field quanta may be 
listed. 

With respect to the atom, Pauli’s exclusion principle is conveniently 
formulated as follows: in one and the same atom no two 
electrons can have the same set of values for the four quantum 
numbers: the principal quantum number n, the orbital quantum number l, 
the magnetic quantum number k, and the spin quantum number $\sigma$. 
(The spin quantum number is a measure of the spin projection on the 
same axis on which the orbital angular momentum is projected.) 

Sometimes, instead of the four above numbers it is useful to 
adopt the following: the principal quantum number, the total 
angular momentum $j = |\mathbf{l} + \mathbf{s}|$, the orbital quantum number, 
which in the present case states how the orbital angular momentum 
is added to the spin (that is, whether they are parallel or antiparallel), 
and the projection of the total angular momentum on an axis. 

According to this choice of quantum numbers, the following 
system of notation is adopted. First we write the principal quantum 
number of the electron, n; then the orbital quantum number, but 
rather than its value we write the corresponding state, s, p, d, or f; 
the value of the total angular momentum, j, is written as a subscript. 
Thus, the notation contains three of the four quantum numbers. 
With respect to the fourth number (the projection $k_j$) there is a 
degeneracy. For a given j this fourth number may take 2j + 1 values. 
Consequently, according to Pauli’s exclusion principle, no more 
than 2j + 1 electrons may occur in an nlj state. 

The number of electrons in an atom having the given three quantum 
numbers n, l, and j, is denoted in the form of an exponent attached 
to the spectroscopic notation of the state taken in parentheses (the 
number of electrons being read as one would an exponent: square, 
cube, etc.). For example, if there are two electrons in a state with 
n = 2, l = 1, j = 1/2, the state as a whole is written $(2p_{1/2})^2$. 
Obviously, the exponent cannot be greater than 2j + 1. 
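
The counting rule can be tabulated in a few lines. This is a small sketch with illustrative helper names: for each orbital quantum number l it lists the possible values of j, the largest admissible exponent 2j + 1 for each of them, and the sum, which equals 2(2l + 1), the number of states with given n and l.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def j_values(l):
    """Possible total angular momenta of one electron with orbital number l."""
    return [l + HALF] if l == 0 else [l - HALF, l + HALF]

def capacity(j):
    """Largest number of electrons allowed in an nlj state: 2j + 1."""
    return int(2 * j + 1)

for l, letter in enumerate("spdf"):
    parts = {str(j): capacity(j) for j in j_values(l)}
    print(letter, parts, "total:", sum(parts.values()))
```

For the p state, for instance, the limits are 2 for j = 1/2 and 4 for j = 3/2, six electrons in all.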

Addition of the Angular Momenta of Two Electrons Having the 
Same n and l. When should the quantum numbers n, l, k, $\sigma$ be used, 
and when n, l, j, $k_j$? In Section 31 it was shown that spin-orbit 
interaction is due to magnetic forces. But as we shall see later, more 
important is a special type of interaction between the orbital and 
spin angular momenta of different electrons, which is of a purely 
electrostatic origin. In different cases one or another type of interac¬ 
tion, or coupling, may predominate. 

If the angular momenta of individual electrons are more strongly 
coupled, the resultant atomic state is developed as follows: the 
orbital angular momenta combine, yielding a total orbital angular 
momentum $L = |\sum_i \mathbf{l}_i|$, the spin angular momenta combine, yielding 
a total spin $S = |\sum_i \mathbf{s}_i|$, and only then the total orbital and 
total spin angular momenta combine, due to magnetic forces, into 
the total angular momentum $J = |\mathbf{L} + \mathbf{S}|$. This type of coupling is 
called normal. 

If the spin and orbital angular momenta of each electron combine 
first, and after that the angular momenta of the individual electrons 
combine into the total angular momentum, the coupling is said 
to be anomalous. 

Leaving aside for the time being the question of the causes re¬ 
sponsible for one or another type of coupling, we shall examine the 
rules for the addition of the orbital and spin angular momenta of 
individual electrons, which differ somewhat from the general rules 
for the composition of momenta due to Pauli’s exclusion principle. 
Certain additional restrictions appear which are due to the fact that 
no two electrons can occur in a state with the same four quantum 
numbers. The following very simple example offers an idea of how 
Pauli’s exclusion principle is applied. 

If two electrons have different principal quantum numbers n, 
or different orbital quantum numbers l, in adding their angular 
momenta Pauli’s exclusion principle can be neglected: at least one 
of the quantum numbers differs. But if n and l are the same, it 
should be borne in mind that not all conceivable states of one electron 
are compatible with the states of the other electron. The same refers 
to the addition of the angular momenta of a greater number of 
electrons with the same n and l. Such electrons usually have very 
close energy values, and the states of atoms are grouped according 
to them; it is said that the electrons with the same n and l belong 
to the same shell of the atom. 

Let us consider the simplest case of n = 1. Then, from the definition 
(29.40), l = 0. But at zero l the magnetic quantum number k is 
also zero. Hence, the electrons have the same three quantum numbers, 
and according to Pauli’s exclusion principle the fourth quantum 
number $\sigma$ must necessarily differ. But $\sigma$ can have only two values, 
1 or -1, according to the two spin projections ±1/2. From Pauli’s 
exclusion principle, each value of $\sigma$ can belong to one electron with 
the given n, l, and k, equal respectively to 1, 0, and 0. Hence the 
resultant state possesses spin S = 1/2 - 1/2 = 0. If Pauli’s exclusion 
principle were not taken into account, the total spin could be 
unity. 

Before considering the more general case, we shall introduce a 
system of notation referring to the shell as a whole. It is analogous 
to the designation of the states of a separate electron, only instead 
of lower-case Latin letters we use upper-case letters. The concept 
of principal quantum number as applied to several electrons is 
meaningless, and we write only the total orbital angular momentum 
S, P, D, or F, according to what L is equal to. The number 2S + 1, 
where S is the total spin of the electrons, is written as a superior 
prefix to the symbol; the total angular momentum J is written as 
a subscript, and the total parity of the electrons is written as a 
superscript. Even states are called gerade states and labeled with 
the superscript g; odd states are called ungerade states and are labeled 
with the superscript u. If it is necessary to specify in greater detail the 
states of the individual electrons that yielded the resultant state, 
the shell distribution of the electrons according to quantum numbers 
is attached in the notation described above. 

For example, in the case of two electrons with n = 1, l = 0, 
k = 0, the resultant state of the shell is denoted as: 

$$(1s_{1/2})^2\ {}^1S_0^g$$

Here, only one resultant state is possible. But several resultant 
states are also possible for the same distribution of individual elec¬ 
trons over different quantum numbers. We shall now examine such 
a case. 

Let there be a state $(np)^2$, that is, two electrons having the same 
principal quantum number and the same orbital quantum number, the 
latter equal to 1. Either their magnetic or their spin quantum numbers 
must differ, or both. A p-electron can be in one of six states, which we 
list writing the magnetic quantum number first and the spin projection 
second: 

A: 1, 1/2;  B: 0, 1/2;  C: -1, 1/2;  D: 1, -1/2 

E: 0, -1/2;  F: -1, -1/2 

It follows that two electrons can occupy any two different states 
of the six. As is known, the number of combinations of six things, 
two at a time, is equal to C(6, 2) = (6 × 5)/(1 × 2) = 15. These 
fifteen states differ in the total orbital angular momentum L and 
the total spin S, as well as in their projections. The latter depend upon 
the choice of coordinate axes and will interest us only insofar as 
they characterize the relative directions of L and S. 

As we know, a state with a given angular momentum is defined 
by its maximum projection, that is, by the maximum possible value 
of $L_z$ compatible with Pauli’s exclusion principle, and correspondingly, 
the largest projection of S. Of the fifteen states, we must in any 
case select only those for which the total projections, $L_z$ and $S_z$, are 
positive or zero, since negative projections cannot, obviously, be 
the largest. 

Eight of the fifteen states have non-negative projections, and from 
those eight we select the ones with the largest. We rewrite all eight 
states compatible with Pauli’s exclusion principle: 

AB: 1, 1; AC: 0, 1; AD: 2, 0; AE: 1, 0 

AF: 0, 0; BD: 1, 0; BE: 0, 0; CD: 0, 0 

The second number in each pair denotes the total spin projection, that is, 
it already involves the necessary factor 1/2. 
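
The count of fifteen states, and the selection of the eight with non-negative projections, can be reproduced by direct enumeration. The sketch below uses the labels A to F exactly as assigned in the list of one-electron states (magnetic quantum number first, spin projection second); the dictionary and function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

HALF = Fraction(1, 2)

# One-electron p states: label -> (magnetic quantum number, spin projection)
STATES = {
    "A": (1, HALF), "B": (0, HALF), "C": (-1, HALF),
    "D": (1, -HALF), "E": (0, -HALF), "F": (-1, -HALF),
}

def two_electron_states():
    """All pairs of different one-electron states with their total projections."""
    result = {}
    for a, b in combinations(sorted(STATES), 2):
        lz = STATES[a][0] + STATES[b][0]
        sz = STATES[a][1] + STATES[b][1]
        result[a + b] = (lz, sz)
    return result

pairs = two_electron_states()
non_negative = {name: proj for name, proj in pairs.items()
                if proj[0] >= 0 and proj[1] >= 0}

print(len(pairs))
for name, (lz, sz) in sorted(non_negative.items()):
    print(f"{name}: {lz}, {sz}")
```

The enumeration returns the same eight pairs, AB through CD, with the same projections as in the table.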

Now we take the states with the greatest angular momentum 
projections from among those listed. We start with AB. To it corresponds 
an orbital angular momentum projection equal to unity and a spin 
projection also equal to unity. Each of them may also take on the value 
zero (we agreed in advance not to consider negative values). Consequently, 
we can disregard AC, AE, and AF at once. Of the remaining, let us 
consider AD. An angular momentum with the maximum projection 2 
has positive projections 2, 1, 0; hence, BD and BE should be discarded 
as possible projections of AD. There remains CD, which has no other 
projections. Obviously, it does not matter whether we take AE 
or BD for the projection of AB: this does not affect the counting 
of the number of states. 

Let us now write the resultant states in spectroscopic notation, 
taking into account that the total angular momentum J varies from 
L + S to |L - S|: 

AB: $(np)^2\,{}^3P_2^g$, $(np)^2\,{}^3P_1^g$, $(np)^2\,{}^3P_0^g$ 

AD: $(np)^2\,{}^1D_2^g$ 

CD: $(np)^2\,{}^1S_0^g$ 

Let us also consider the case of three p-electrons. For them we 
have seven states compatible with Pauli’s exclusion principle: 

ABC: 0, 3/2;  ACE: 0, 1/2;  ABD: 2, 1/2;  ABE: 1, 1/2 

ABF: 0, 1/2;  ACD: 1, 1/2;  BCD: 0, 1/2 

The maximum spin projection is 3/2, for a zero orbital angular 
momentum projection. The maximum orbital angular momentum 
projection is 2, the total spin projection being 1/2. These two states, 
together with their projections, are the ones listed above from ABC 
to ABF. There remains the state ACD, with respect to which BCD 
can be considered its projection. Thus, the resultant states are 

ABC: $(np)^3\,{}^4S_{3/2}^u$ 

ABD: $(np)^3\,{}^2D_{5/2}^u$, $(np)^3\,{}^2D_{3/2}^u$ 

ACD: $(np)^3\,{}^2P_{3/2}^u$, $(np)^3\,{}^2P_{1/2}^u$ 
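
The selection procedure used for two and three p-electrons can be written as a short algorithm. The sketch below is one possible formulation, not the only one: it tabulates how many states have each pair of total projections, repeatedly takes the largest remaining orbital projection (and, for it, the largest spin projection) as a new pair (L, S), and removes all (2L + 1)(2S + 1) projections of that pair. The names `terms` and `label` are illustrative; J values and parity are not computed.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

HALF = Fraction(1, 2)
P_STATES = [(k, s) for s in (HALF, -HALF) for k in (1, 0, -1)]
LETTERS = "SPDFGH"

def terms(n_electrons, one_electron_states=P_STATES):
    """Return the list of (L, S) pairs allowed by Pauli's exclusion principle."""
    table = Counter()
    for combo in combinations(one_electron_states, n_electrons):
        ml = sum(k for k, _ in combo)
        ms = sum(s for _, s in combo)
        table[(ml, ms)] += 1
    found = []
    while table:
        big_l, big_s = max(table)  # largest L_z first, then largest S_z
        found.append((big_l, big_s))
        for ml in range(-big_l, big_l + 1):
            for step in range(int(2 * big_s) + 1):
                key = (ml, -big_s + step)
                table[key] -= 1
                if table[key] == 0:
                    del table[key]
    return found

def label(big_l, big_s):
    """Multiplicity 2S + 1 followed by the letter for L."""
    return f"{int(2 * big_s + 1)}{LETTERS[big_l]}"

print([label(l, s) for l, s in terms(2)])
print([label(l, s) for l, s in terms(3)])
```

For two electrons this yields the singlet D, triplet P, and singlet S states, and for three electrons the doublet D, doublet P, and quartet S states, the same sets as obtained above by inspection (the order of discovery differs).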

The Wave Equation for a Two-Electron System. Let us now formulate 
Pauli’s exclusion principle with the help of wave functions. 
To avoid mathematical complications, we shall consider a two-electron 
system. The wave equation for two electrons should be 
written as follows: 

$$\mathcal{H}\Phi = \left(-\frac{h^2}{2m}\nabla_1^2 - \frac{h^2}{2m}\nabla_2^2 + U(\mathbf{r}_1, \mathbf{r}_2)\right)\Phi = E\Phi \tag{33.1}$$

Here, $\nabla_1^2$ and $\nabla_2^2$ are the Laplace operators with respect to the 
variables of the first and second electrons, and $U(\mathbf{r}_1, \mathbf{r}_2)$ is the 
potential energy of their interaction with the external field (ext) 
and between themselves (int): 

$$U(\mathbf{r}_1, \mathbf{r}_2) = U_{\mathrm{ext}}(\mathbf{r}_1, \mathbf{r}_2) + U_{\mathrm{int}}(\mathbf{r}_1, \mathbf{r}_2) \tag{33.2}$$


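For example, in a helium atom 

$$U(\mathbf{r}_1, \mathbf{r}_2) = -\frac{2e^2}{r_1} - \frac{2e^2}{r_2} + \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|} \tag{33.3}$$

The wave function depends on the spatial and spin variables of 
both electrons: 

$$\Phi = \Phi(\mathbf{r}_1, s_1; \mathbf{r}_2, s_2) \tag{33.4}$$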

The interaction between the spin and orbital angular momenta 
is weak, at least in the case of low-Z elements (see Sec. 30). Therefore 
in the potential energy operator we can, in the first approximation, 
neglect the spin-orbit interaction, which corresponds to (33.3). 
If the effect of the spin on the orbital motion is small, the probability 
of a certain value of the spin and the coordinate is equal to the 
product of the probabilities of both values, and the probability 
amplitude $\Phi$ also separates into a product of amplitudes: 

$$\Phi = \Psi(\mathbf{r}_1, \mathbf{r}_2)\,\chi(s_1, s_2) \tag{33.5}$$

The probability amplitude of the orbital motion satisfies (33.1), 
provided it does not involve the spin operator. When the system 
is placed in an external homogeneous magnetic field H, to the 
Hamiltonian is added the operator 

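$$\mathcal{H}_{\mathrm{mag}} = \mu\,[(\boldsymbol{\sigma}_1 \cdot \mathbf{H}) + (\boldsymbol{\sigma}_2 \cdot \mathbf{H})] = \mu\,|\mathbf{H}|\,(\sigma_{1z} + \sigma_{2z}) \tag{33.6}$$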

where the z axis is taken in the direction of the field (the sign has 
been changed to “+” because the charge of the electron is negative). 
The operation of the operator $\sigma_{1z} + \sigma_{2z}$ on the spin function $\chi$ 
yields simply the total projection of the spin of both electrons, which 
in the absence of spin-orbit interaction can be treated as an integral 
of the motion, that is, as a number. This number is simply added to 
the Hamiltonian, so that the equation for $\Phi$ does not alter its form. 

Examining the operator $\mathcal{H}$ in (33.1), we see that it is completely 
symmetric in the coordinates of both electrons, that is, it does not 
change its form if the first electron is called the second, and the 
second the first: 

$$\mathcal{H}(\mathbf{r}_1, s_1; \mathbf{r}_2, s_2) = \mathcal{H}(\mathbf{r}_2, s_2; \mathbf{r}_1, s_1) \tag{33.7}$$

But (33.1) is a linear equation; hence, if it does not change in 
the operation (33.7), the wave function can only be multiplied by 
a constant number P: 

$$\Phi(\mathbf{r}_1, s_1; \mathbf{r}_2, s_2) = P\,\Phi(\mathbf{r}_2, s_2; \mathbf{r}_1, s_1) \tag{33.8}$$

Since $\mathbf{r}_1$, $s_1$ and $\mathbf{r}_2$, $s_2$ are involved in the same way in all the 
equations, in (33.8) they can be interchanged, yielding 

$$\Phi(\mathbf{r}_2, s_2; \mathbf{r}_1, s_1) = P\,\Phi(\mathbf{r}_1, s_1; \mathbf{r}_2, s_2) \tag{33.9}$$

We substitute (33.9) into (33.8) to get 

$$\Phi(\mathbf{r}_1, s_1; \mathbf{r}_2, s_2) = P^2\,\Phi(\mathbf{r}_1, s_1; \mathbf{r}_2, s_2)$$

or 

$$P^2 = 1, \qquad P = \pm 1 \tag{33.10}$$

In this comparatively simple case of two particles the interchange 
of all coordinates (space and spin) is similar to the symmetry trans¬ 
formation of their corresponding wave functions in reflection (see 
(29.44)). 

We now introduce the exchange operators $P_r$ and $P_s$, which operate 
only on the electrons’ coordinates and only on their spins, respectively. Thus 

$$P_r\,\Psi(\mathbf{r}_1, \mathbf{r}_2) = \Psi(\mathbf{r}_2, \mathbf{r}_1) \tag{33.11}$$

and 

$$P_s\,\chi(s_1, s_2) = \chi(s_2, s_1) \tag{33.12}$$

If the wave equation is symmetric with respect to the interchange 
of $\mathbf{r}_1$ and $\mathbf{r}_2$, or of $s_1$ and $s_2$, then the eigenvalues of $P_r$ and $P_s$ are 
equal to ±1. 
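
The action of an exchange operator is easy to imitate with ordinary functions of two arguments. In the sketch below the coordinates are reduced to single numbers, and the test function `f` is an arbitrary choice; the helper names are illustrative. Applying the exchange twice restores the function, and its symmetric and antisymmetric parts are reproduced with the factors +1 and -1.

```python
import math

def exchange(f):
    """Exchange operator: returns the function with its two arguments swapped."""
    return lambda r1, r2: f(r2, r1)

def symmetric_part(f):
    return lambda r1, r2: 0.5 * (f(r1, r2) + f(r2, r1))

def antisymmetric_part(f):
    return lambda r1, r2: 0.5 * (f(r1, r2) - f(r2, r1))

def f(r1, r2):
    """An arbitrary function with no symmetry of its own."""
    return math.exp(-r1) * r2

points = [(0.3, 1.7), (1.0, 2.5), (2.2, 0.4)]
sym, anti = symmetric_part(f), antisymmetric_part(f)

for r1, r2 in points:
    print(exchange(sym)(r1, r2) / sym(r1, r2),     # eigenvalue +1
          exchange(anti)(r1, r2) / anti(r1, r2))   # eigenvalue -1
```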

We denote the set of “spatial” quantum numbers of the first electron 
by the letter $n_1$ (instead of $n_1$, $l_1$, and $k_1$), and of the second electron 
by the letter $n_2$. Then the spatial wave function can be written in 
greater detail as 

$$\Psi = \Psi(n_1, \mathbf{r}_1; n_2, \mathbf{r}_2)$$

and from the requirement (33.10) it follows that 

$$P_r\,\Psi(n_1, \mathbf{r}_1; n_2, \mathbf{r}_2) = \Psi(n_1, \mathbf{r}_2; n_2, \mathbf{r}_1) = \pm\Psi(n_1, \mathbf{r}_1; n_2, \mathbf{r}_2) \tag{33.13}$$

In the case of the upper sign, function (33.13) is said to be symmetric, 
in the case of the lower sign it is antisymmetric. The spin function 
$\chi(\sigma_1, s_1; \sigma_2, s_2)$ possesses similar properties with respect to the 
exchange operator of the spin variables $s_1$ and $s_2$. 

Now let us consider the requirements Pauli’s exclusion principle 
imposes on the wave function of a two-electron system. We write 
the function in the form 

$$\Phi(n_1, \sigma_1, \mathbf{r}_1, s_1; n_2, \sigma_2, \mathbf{r}_2, s_2)$$

A total exchange of the spin and spatial variables in this function 
is due to the operation of the operator P equal to 

$$P = P_r P_s \tag{33.14}$$

Operating with (33.14) on the function $\Phi$, we obtain 

$$P\,\Phi(n_1, \sigma_1, \mathbf{r}_1, s_1; n_2, \sigma_2, \mathbf{r}_2, s_2) = \Phi(n_1, \sigma_1, \mathbf{r}_2, s_2; n_2, \sigma_2, \mathbf{r}_1, s_1) \tag{33.15}$$

From (33.10), this function is also either symmetric or antisymmetric. 
But it is immediately apparent that only the antisymmetric 
function satisfies Pauli’s exclusion principle. Indeed, let the states 
of both electrons be identical, that is, $n_1 = n_2$ and $\sigma_1 = \sigma_2$. Then, 
if $\Phi$ is antisymmetric, we have 

$$\begin{aligned}
P\,\Phi(n_1, \sigma_1, \mathbf{r}_1, s_1; n_1, \sigma_1, \mathbf{r}_2, s_2)
&= \Phi(n_1, \sigma_1, \mathbf{r}_2, s_2; n_1, \sigma_1, \mathbf{r}_1, s_1) \\
&= -\Phi(n_1, \sigma_1, \mathbf{r}_1, s_1; n_1, \sigma_1, \mathbf{r}_2, s_2) \\
&= \Phi(n_1, \sigma_1, \mathbf{r}_1, s_1; n_1, \sigma_1, \mathbf{r}_2, s_2)
\end{aligned} \tag{33.16}$$

By definition, P interchanges only the variables r and s, but not 
the quantum numbers n and a. The first equality in (33.16) denotes 

the result of the operation of P, the second takes account of the 

antisymmetry of the wave function, while the third is obtained 
from the first expression for O by means of an exchange of both 
quadruplets of arguments referring to each electron separately. The 
possibility of such an exchange for any function is apparent, since 
it is immaterial which electron is assumed the first and which the 
second. 

The exchange of the first quadruplet $(n_1, \sigma_1, \mathbf{r}_1, s_1)$ with the 
quadruplet $(n_1, \sigma_1, \mathbf{r}_2, s_2)$ in the last equality in (33.16) is simply 
meaningless: it makes no difference which arguments we write first, 
those referring to the first or to the second electron, that is $(n_1, \sigma_1, \mathbf{r}_1, s_1)$ 
or $(n_1, \sigma_1, \mathbf{r}_2, s_2)$. Thus, the function $\Phi(n_1, \sigma_1, \mathbf{r}_1, s_1; n_1, \sigma_1, \mathbf{r}_2, s_2)$ 
is equal to itself with the sign reversed; hence, it is identically zero. 

It is apparent that, given the same quantum numbers, only an 
antisymmetric function possesses this property: a symmetric one 
would transform into itself. But if the antisymmetric function of 
two electrons in the same states is identically zero, then the ampli¬ 
tude of the probability of a two-electron system being in such a state 
is equal to zero for all values of the variables $\mathbf{r}_1$, $s_1$, $\mathbf{r}_2$, and $s_2$. Only 
an antisymmetric function is compatible with Pauli’s exclusion 
principle. 

The same holds for the wave function of a many-electron system: 
it is antisymmetric with respect to a simultaneous interchange of 
the spatial and spin variables of any pair of electrons. This is the 
general formulation of Pauli’s exclusion principle. 
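
A toy illustration of this property: build an antisymmetric two-electron function out of two one-electron functions and watch it vanish when both electrons are given the same quantum numbers. Everything specific in the sketch is an assumption made for the example: the spatial parts are one-dimensional box functions $\sqrt{2}\sin(n\pi r)$, the spin variable takes the values +1 and -1, and the names are illustrative.

```python
import math

def box_orbital(n, r):
    """Model spatial function (an assumption of this sketch)."""
    return math.sqrt(2.0) * math.sin(n * math.pi * r)

def spin_orbital(n, sigma):
    """One-electron function of (r, s) with quantum numbers n and sigma."""
    def u(r, s):
        return box_orbital(n, r) if s == sigma else 0.0
    return u

def antisymmetric(u_a, u_b):
    """Antisymmetric combination of two one-electron functions."""
    def phi(r1, s1, r2, s2):
        return (u_a(r1, s1) * u_b(r2, s2)
                - u_a(r2, s2) * u_b(r1, s1)) / math.sqrt(2.0)
    return phi

same = antisymmetric(spin_orbital(1, +1), spin_orbital(1, +1))
different = antisymmetric(spin_orbital(1, +1), spin_orbital(1, -1))

samples = [(r1, s1, r2, s2)
           for r1 in (0.2, 0.5) for r2 in (0.3, 0.7)
           for s1 in (+1, -1) for s2 in (+1, -1)]

print(max(abs(same(*x)) for x in samples))       # identical states
print(max(abs(different(*x)) for x in samples))  # states differing in sigma
```

The first function is zero at every sample point, while the second is not, and it changes sign when the two pairs of variables are interchanged.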

Self-Consistent Field. At the beginning of this section it was 
pointed out that the state of an atom can be described by stating 
the number of electrons in a state with given quantum numbers: 
one electron or none. But what do the quantum numbers themselves 
denote in a many-electron system? 

Every electron is assumed to be in the field of the nucleus and 
of all the other electrons; such a field is called self-consistent. On the 
basis of such a model of a many-electron atom it is even possible to 
compute approximately the energy levels of atoms, as was first 
done by D. R. Hartree. 

V. A. Fock substantially improved the self-consistent field 
method by taking account of electron identity and Pauli’s exclu¬ 
sion principle. Fock’s method is based on the variation property of 
energy eigenvalues, that is, the extremal property of the integral 
(32.16). In this, the wave function of a two-electron system (we 
restrict ourselves to two electrons) is in the first approximation 
chosen in the form of the product of the wave functions of each 
electron separately. Then these functions are so chosen that the 
integral (32.16) has an extremum, on condition of retaining the 
normalization of both functions in accordance with (32.17). 

In order to satisfy Pauli’s exclusion principle at the same time, 
the initial wave functions should be selected not simply as a product 
of the functions, $\psi_1(\mathbf{r}_1)\,\psi_2(\mathbf{r}_2)$, but as a superposition of the form 

$$\Psi = \frac{1}{\sqrt{2}}\left[\psi_1(\mathbf{r}_1)\,\psi_2(\mathbf{r}_2) \pm \psi_1(\mathbf{r}_2)\,\psi_2(\mathbf{r}_1)\right] \tag{33.17}$$

(the wave function subscripts 1 and 2 are quantum numbers). 

In the case of the upper (plus) sign this coordinate function is 
multiplied by an antisymmetric spin function $\chi$, and for the lower 
(minus) sign, by a symmetric spin function. Such functions will be 
developed later in this section; for the time being we accept the 
assertion that when the symmetries of the spatial and spin wave 
functions are opposite, the whole function $\Phi$ is antisymmetric. 

If the Hamiltonian of a two-electron system does not depend upon 
the spins, then the expression (32.16) retains only the integration over 
spatial variables, while summation over the spin variables in every 
case yields unity. “Memory” of the spin survives in the spatial wave 
function only in the sign between the terms. 

The factor $1/\sqrt{2}$ in (33.17) is selected for purposes of normalization: 
if $\psi_1$ and $\psi_2$ are normalized and orthogonal, then $\Psi$ is also 
a normalized function, and (32.16), with the wave function (33.17), 
can be treated as the mean energy. The normalization of $\Psi$ is verified 
in the following way: 

$$\begin{aligned}
\int |\Psi|^2\,dV_1\,dV_2
&= \frac{1}{2}\int |\psi_1|^2\,dV_1 \int |\psi_2|^2\,dV_2 + \frac{1}{2}\int |\psi_2|^2\,dV_1 \int |\psi_1|^2\,dV_2 \\
&\quad \pm \frac{1}{2}\int \psi_1^*\psi_2\,dV_1 \int \psi_2^*\psi_1\,dV_2 \pm \frac{1}{2}\int \psi_2^*\psi_1\,dV_1 \int \psi_1^*\psi_2\,dV_2 = 1
\end{aligned} \tag{33.18}$$

We express the two-electron Hamiltonian as follows: 

$$\mathcal{H} = \mathcal{H}'(\mathbf{r}_1) + \mathcal{H}'(\mathbf{r}_2) + V_{12} \tag{33.19}$$

Here $\mathcal{H}'(\mathbf{r}_1)$ and $\mathcal{H}'(\mathbf{r}_2)$ are of the same form, but they depend upon 
the variables of the first and second electron, while $V_{12}$ is the 
interaction Hamiltonian, equal to $e^2\,|\mathbf{r}_1 - \mathbf{r}_2|^{-1}$. 

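Now, substitute the wave function (33.17) into the mean-energy integral 
(32.16) with the Hamiltonian (33.19) and integrate over $dV_1$ and $dV_2$. 
Here we can always redesignate the integration variables, that is, in 
each term replace $\mathbf{r}_1$ by $\mathbf{r}_2$ and vice versa. 
As a result each term occurs twice, and the 2 cancels out with the 
normalization factor 1/2. We thus obtain 

$$\begin{aligned}
\langle\mathcal{H}\rangle &= \int \psi_1^*(\mathbf{r}_1)\,\psi_2^*(\mathbf{r}_2)\,[\mathcal{H}'(\mathbf{r}_1) + \mathcal{H}'(\mathbf{r}_2) + V_{12}]\,\psi_1(\mathbf{r}_1)\,\psi_2(\mathbf{r}_2)\,dV_1\,dV_2 \\
&\quad \pm \int \psi_1^*(\mathbf{r}_1)\,\psi_2^*(\mathbf{r}_2)\,[\mathcal{H}'(\mathbf{r}_1) + \mathcal{H}'(\mathbf{r}_2) + V_{12}]\,\psi_1(\mathbf{r}_2)\,\psi_2(\mathbf{r}_1)\,dV_1\,dV_2
\end{aligned}$$

Taking into account that $\psi_1$ and $\psi_2$ are orthogonal and normalized, 

$$\begin{aligned}
\langle\mathcal{H}\rangle &= \int \psi_1^*\,\mathcal{H}'(\mathbf{r})\,\psi_1\,dV + \int \psi_2^*\,\mathcal{H}'(\mathbf{r})\,\psi_2\,dV \\
&\quad + \int \psi_1^*(\mathbf{r}_1)\,\psi_2^*(\mathbf{r}_2)\,V_{12}\,\psi_1(\mathbf{r}_1)\,\psi_2(\mathbf{r}_2)\,dV_1\,dV_2 \\
&\quad \pm \int \psi_1^*(\mathbf{r}_1)\,\psi_2^*(\mathbf{r}_2)\,V_{12}\,\psi_1(\mathbf{r}_2)\,\psi_2(\mathbf{r}_1)\,dV_1\,dV_2
\end{aligned} \tag{33.20}$$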

Let us examine the meaning of the various terms in the mean 
energy. The first two define the mean energy of the separate electrons 
in the first and second states. The third term is the energy of 
interaction between the electrons, since $e\,\psi_1^*(\mathbf{r}_1)\,\psi_1(\mathbf{r}_1)$ is the charge density 
of the first electron, $e\,\psi_2^*(\mathbf{r}_2)\,\psi_2(\mathbf{r}_2)$ is the charge density of the 
second electron, and $e^2\,|\psi_1(\mathbf{r}_1)|^2\,|\psi_2(\mathbf{r}_2)|^2\,|\mathbf{r}_1 - \mathbf{r}_2|^{-1}\,dV_1\,dV_2$ 
is the interaction energy of two charge elements. This quantity is 
developed according to the classical law, and its appearance is 
obvious. The last term is of a quantum nature: it appears as a 
consequence of symmetrization of the wave function (33.17). It is called 
the exchange integral, or exchange energy, of the two electrons. The 
term “exchange” was adopted because the electron is as it were 
simultaneously in both states. 

The sign of the exchange integral depends upon the spin state of 
the system. Therefore, even though we neglect the magnetic interac¬ 
tion of the spins, it is necessary to take into account their interaction 
through the exchange integral. Here, the interaction cannot be 
reduced to some “force”: it is due to purely quantum properties of 
wave function symmetry stemming from Pauli’s exclusion principle. 
The exchange interaction is greater than the magnetic interaction, 
because it is, in the final analysis, due to the Coulomb potential 
energy of two electrons, $e^2\,|\mathbf{r}_1 - \mathbf{r}_2|^{-1}$, and does not contain $c^2$ 
in the denominator. 
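
The relative size and sign of the last two terms of (33.20) can be seen on a toy model. The sketch below is not a calculation for a real atom: it assumes one-dimensional box orbitals $\sqrt{2}\sin(n\pi x)$ on the interval (0, 1), replaces the Coulomb law by a softened form $e^2/\sqrt{(x_1 - x_2)^2 + a^2}$ to avoid the one-dimensional singularity, and evaluates both double integrals by the midpoint rule. All names and parameter values are illustrative.

```python
import math

def psi(n, x):
    """Model orbital: particle in a box on (0, 1) (an assumption of this sketch)."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)

def soft_coulomb(x1, x2, e2=1.0, a=0.1):
    """Softened repulsion e^2 / sqrt((x1 - x2)^2 + a^2); a is a model parameter."""
    return e2 / math.sqrt((x1 - x2) ** 2 + a * a)

def direct_and_exchange(n1=1, n2=2, points=80):
    """Midpoint-rule values of the third and fourth integrals of (33.20)."""
    step = 1.0 / points
    grid = [(k + 0.5) * step for k in range(points)]
    direct = 0.0
    exchange = 0.0
    for x1 in grid:
        p1x1, p2x1 = psi(n1, x1), psi(n2, x1)
        for x2 in grid:
            p1x2, p2x2 = psi(n1, x2), psi(n2, x2)
            v = soft_coulomb(x1, x2)
            direct += p1x1 ** 2 * p2x2 ** 2 * v
            exchange += p1x1 * p2x2 * v * p1x2 * p2x1
    return direct * step * step, exchange * step * step

direct, exchange = direct_and_exchange()
print("direct:", direct, "exchange:", exchange)
print("symmetric spatial function:", direct + exchange)
print("antisymmetric spatial function:", direct - exchange)
```

In this model the exchange integral is positive and smaller than the direct one, so the antisymmetric spatial function (the lower sign, symmetric spin function) has the lower repulsion energy.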

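We now vary the integrals (33.20) with respect to $\psi_1^*$ and $\psi_2^*$, 
provided the functions are orthogonal, and additionally $\int \psi_1^*\psi_1\,dV = 1$ 
and $\int \psi_2^*\psi_2\,dV = 1$. As usual, the variations of these two 
expressions are multiplied respectively by the parameters $-E_1$ 
and $-E_2$, and the variations of the orthogonality conditions, 
$\int \psi_1^*\psi_2\,dV = 0$ and $\int \psi_2^*\psi_1\,dV = 0$, by two other parameters, 
$-\eta_1$ and $-\eta_2$. Denoting the variation with respect to $\psi_1^*$ as $\delta_{\psi_1^*}$, 
and with respect to $\psi_2^*$ as $\delta_{\psi_2^*}$, yields the extremal condition for $\langle\mathcal{H}\rangle$ 
under the additional conditions: 

$$\delta_{\psi_1^*}\langle\mathcal{H}\rangle - E_1\,\delta_{\psi_1^*}\!\int \psi_1^*\psi_1\,dV - \eta_1\,\delta_{\psi_1^*}\!\int \psi_1^*\psi_2\,dV = 0 \tag{33.21a}$$

$$\delta_{\psi_2^*}\langle\mathcal{H}\rangle - E_2\,\delta_{\psi_2^*}\!\int \psi_2^*\psi_2\,dV - \eta_2\,\delta_{\psi_2^*}\!\int \psi_2^*\psi_1\,dV = 0 \tag{33.21b}$$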

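After performing the variation we arrive at a system of two 
integro-differential equations: 

$$\mathcal{H}'(\mathbf{r})\,\psi_1(\mathbf{r}) + e^2\!\int \frac{|\psi_2(\mathbf{r}')|^2}{|\mathbf{r} - \mathbf{r}'|}\,dV'\,\psi_1(\mathbf{r}) \pm e^2\!\int \frac{\psi_2^*(\mathbf{r}')\,\psi_1(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,dV'\,\psi_2(\mathbf{r}) = E_1\psi_1(\mathbf{r}) + \eta_1\psi_2(\mathbf{r}) \tag{33.22a}$$

$$\mathcal{H}'(\mathbf{r})\,\psi_2(\mathbf{r}) + e^2\!\int \frac{|\psi_1(\mathbf{r}')|^2}{|\mathbf{r} - \mathbf{r}'|}\,dV'\,\psi_2(\mathbf{r}) \pm e^2\!\int \frac{\psi_1^*(\mathbf{r}')\,\psi_2(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,dV'\,\psi_1(\mathbf{r}) = E_2\psi_2(\mathbf{r}) + \eta_2\psi_1(\mathbf{r}) \tag{33.22b}$$

Multiplying (33.22a) by $\psi_1^*(\mathbf{r})$ and by $\psi_2^*(\mathbf{r})$, and integrating 
over V, we obtain the expressions for $E_1$ and $\eta_1$: 

$$\begin{aligned}
E_1 &= \int \psi_1^*(\mathbf{r})\,\mathcal{H}'(\mathbf{r})\,\psi_1(\mathbf{r})\,dV + e^2\!\int \frac{|\psi_1(\mathbf{r})|^2\,|\psi_2(\mathbf{r}')|^2}{|\mathbf{r} - \mathbf{r}'|}\,dV\,dV' \\
&\quad \pm e^2\!\int \frac{\psi_1^*(\mathbf{r})\,\psi_2^*(\mathbf{r}')\,\psi_2(\mathbf{r})\,\psi_1(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,dV\,dV'
\end{aligned} \tag{33.23}$$

$$\begin{aligned}
\eta_1 &= \int \psi_2^*(\mathbf{r})\,\mathcal{H}'(\mathbf{r})\,\psi_1(\mathbf{r})\,dV + e^2\!\int \frac{|\psi_2(\mathbf{r}')|^2\,\psi_2^*(\mathbf{r})\,\psi_1(\mathbf{r})}{|\mathbf{r} - \mathbf{r}'|}\,dV\,dV' \\
&\quad \pm e^2\!\int \frac{|\psi_2(\mathbf{r})|^2\,\psi_2^*(\mathbf{r}')\,\psi_1(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,dV\,dV'
\end{aligned} \tag{33.24}$$

Similar expressions are developed for $E_2$ and $\eta_2$. 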

In the solutions it is always possible to separate out the angular 
dependence of the wave functions, so that only the equations in¬ 
volving the dependence upon r remain. The final set of equations can 
be solved by computers, whereas the exact Schrödinger equation for
the two-electron problem so far defies solution by any means. The 
degree to which the solutions obtained from Fock’s equations agree 
with experimental data is in a number of cases quite satisfactory. 
As always, in employing the variational method, the energy eigen¬ 
values agree with experience better than any integral expressions 
obtained with the help of the wave functions determined together 
with the energy. 

The self-consistent field method of representing the wave functions 
offers an understanding of the reason why, in adding the angular 
momenta of several electrons, some values of the total angular 
momentum correspond to smaller total energies than others. 

Let us take as an example two p-electrons. The angular parts of
the wave functions of the p state are first-order spherical functions,
that is, $Y_1^1$, $Y_1^0$, $Y_1^{-1}$:

$$Y_1^0 = \cos\vartheta, \qquad Y_1^{\pm 1} = \pm \sin\vartheta\, e^{\pm i\varphi}$$

The electron density distribution corresponding to these functions
is

$$|Y_1^0|^2 = \cos^2\vartheta, \qquad |Y_1^1|^2 = |Y_1^{-1}|^2 = \sin^2\vartheta$$

In other words, for $Y_1^0$ the distribution is elongated along the
polar axis, while for $Y_1^{\pm 1}$ it is flattened in the “equatorial” plane. 10

10 Figuratively speaking, to nonzero angular momentum projections
correspond greater “arms”.

The electrons’ Coulomb repulsion energy is, as we have just seen,
expressed by the integral

$$e^2 \int \frac{|\psi_1(\mathbf{r}_1)|^2\, |\psi_2(\mathbf{r}_2)|^2}{|\mathbf{r}_1 - \mathbf{r}_2|}\, dV_1\, dV_2$$

If two electrons are in the same orbital states (and consequently 
have opposite spins), their repulsion energy is greatest, because 
they are moving in one spatial domain; if the orbital electrons are 
in different orbital states, the energy is smaller, since the electrons 
are moving at a greater distance from one another. 

If we now consider states with magnetic quantum numbers ±1,
that is, $Y_1^1$ and $Y_1^{-1}$, from the Coulomb repulsion formula we should
expect the same interaction energy for an electron pair with both
angular momentum projections equal to +1 as for a pair with
projections +1 and −1, since the same electron density
corresponds to ±1. Actually, however, there is a difference, due to
the exchange term, though it is nevertheless smaller than the
difference between an electron pair having the same or opposite
projections and a pair having projections equal to 1 and 0, that is,
between $Y_1^1 Y_1^{-1}$ and $Y_1^1 Y_1^0$.

Consequently, in adding the angular momenta of separate elec¬ 
trons, the least-energy state is achieved when the spatial wave 
functions have the smallest overlap. If the squares of their moduli 
are the same, as in $Y_1^1$ and $Y_1^{-1}$, the exchange effect alters due to the
overlap, but it remains greater than for functions with 0 and 1 
angular momentum projections. 

Therefore, when the angular momenta of two p-electrons are 
added their spin must have the maximum value, so that the projec¬ 
tions of orbital angular momentum would correspond to the least 
overlapping wave functions, $Y_1^1$ and $Y_1^0$. In consequence, we find
that a system of two p-electrons has the largest spin, 1, and for this
greatest spin the greatest total orbital angular momentum. We have
arrived at what is known as Hund's first rule.

If we have three p-electrons, the orbital angular momentum of the 
third must already have the magnetic quantum number —1, and 
the total spin is then 3/2. From the energy point of view this is more 
advantageous than if the spin of the third electron was in the oppo¬ 
site sense of the first two electrons, which would be in the same 
orbital state. Hence, of the three possible resultant states of a 
system, $(np)^3$, found above, the $^4S_{3/2}$ state has the least energy.
On the other hand, the states $^2D_{3/2,\,5/2}$ and $^2P_{1/2,\,3/2}$ have two coinciding
projections of the electrons’ orbital angular momenta: for
state $^2D_{3/2,\,5/2}$ this yields two wave functions $Y_1^1$, for state $^2P_{1/2,\,3/2}$,
two functions $Y_1^0$. This confirms Hund’s rule: the spin must be put as
large as possible, and then for the given spin we seek the maximum
possible value of the orbital angular momentum.

The example cited above refers to the nitrogen atom, which has
the configuration $(np)^3$. The least-energy state is $^4S_{3/2}$, as expected;
the $^2D_{3/2,\,5/2}$ states lie 2.2 eV higher, and the $^2P_{1/2,\,3/2}$ states lie
3.8 eV higher. The latter are highest because when two electrons
are moving close to the polar axis in a $Y_1^0$ state, they are on the whole
at a smaller distance than when they move close to the equatorial
plane. This can be easily seen by constructing polar diagrams of
$\sin^2\vartheta$ and $\cos^2\vartheta$ and rotating them about the polar axis.

This provides an explanation of normal coupling of the angular 
momenta of separate electrons. The coupling is due to the Coulomb 
repulsion and to Pauli’s exclusion principle. In atoms that are not 
too heavy, normal coupling due to Coulomb forces always exceeds 
the spin-orbit coupling, which is due to magnetic forces. 

This is the way orbital and spin angular momenta of separate 
electrons are added. However, between the total orbital and total 
spin angular momenta of a system of electrons there exists a magnetic 
interaction similar to that found for an individual electron. 

Unlike a separate electron, the spin of a system of several electrons 
may be greater than 1/2: by Hund’s first rule, larger values of spin 
are preferred. Therefore, in the most general case the total angular 
momentum J has, instead of two possible values, as many values as 
there are values of the vector sum $J = |\mathbf{L} + \mathbf{S}|$. If $L > S$, the
sum has $2S + 1$ values; if $L < S$, it has $2L + 1$ values.

Multiplets. If there were no spin-orbit magnetic interaction, each
term could freely orient by itself. This would result in a
$(2S + 1)(2L + 1)$-fold degeneracy. Magnetic interaction partially lifts
the degeneracy: when the orbital angular-momentum vector is fixed, 
a magnetic field is as it were created; it possesses axial symmetry 
for which to each spin projection there corresponds a specific spin- 
orbit interaction energy. (If the spin is larger a similar reasoning 
applies to the orbital angular momentum.) 

If the projection of the smaller angular momentum on the larger 
is given, then, apparently, the total angular momentum is deter¬ 
mined according to the general rule for adding angular momenta. The 
total angular momentum again can rotate freely in space, but with 
$(2J + 1)$-fold degeneracy. Thus, the magnetic interaction splits
$(2S + 1)(2L + 1)$ levels into $2S + 1$ or $2L + 1$ levels with the
given value of the total angular momentum $J$, each level with
definite $J$ being $(2J + 1)$-fold degenerate. The totality of all these
levels is called a multiplet, and the levels themselves are known as
fine-structure levels.
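
The counting in this paragraph can be checked with a few lines of Python. This is only an illustrative sketch: the function names `multiplet_levels` and `degeneracy` are ours and do not come from the text. It lists the values of $J$ from $|L - S|$ to $L + S$ and verifies that the fine-structure levels together contain all $(2S + 1)(2L + 1)$ states.

```python
from fractions import Fraction


def multiplet_levels(L, S):
    """Allowed total angular momenta J = |L - S|, |L - S| + 1, ..., L + S."""
    L, S = Fraction(L), Fraction(S)
    j = abs(L - S)
    levels = []
    while j <= L + S:
        levels.append(j)
        j += 1
    return levels


def degeneracy(J):
    """Number of projections of a level with total angular momentum J."""
    return int(2 * J + 1)


if __name__ == "__main__":
    # (L, S) pairs chosen only as examples: a 3P, a 2D and a 4S term.
    for L, S in [(1, 1), (2, Fraction(1, 2)), (0, Fraction(3, 2))]:
        Js = multiplet_levels(L, S)
        total = sum(degeneracy(J) for J in Js)
        expected = int((2 * Fraction(S) + 1) * (2 * L + 1))
        print(f"L={L}, S={S}: J = {[str(J) for J in Js]}, "
              f"states = {total}, (2S+1)(2L+1) = {expected}")
```

The number of levels returned is $2S + 1$ when $L > S$ and $2L + 1$ when $L < S$, in line with the statement above.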

A multiplet has $2S + 1$ or $2L + 1$ components. One of them
corresponds to the least energy. It is determined as follows. For
a given orbital quantum number $l$ there may be $2l + 1$ values of the
magnetic quantum number, and for a given magnetic quantum
number there are two spin projections. Hence, there can be
$2(2l + 1)$ electrons in the given configuration. It can readily be
observed that if there are $2l + 1$ electrons in it, the total orbital
angular momentum should be zero. Then the wave functions of all
the separate electrons are different, and the Coulomb repulsion
energy is least. This was examined in greater detail in the example
on the $(np)^3$ configuration.

Thus, filling of the $2(2l + 1)$ states takes place as it were in two
stages: first $2l + 1$ states with the total angular momentum $L = 0$
are filled, and only then the other $2l + 1$ states. We find that when
the former set of states are being filled, the least multiplet level
corresponds to the value $J = |L - S|$, while in filling the subsequent
$2l + 1$ states the least level has $J = L + S$. This is known
as Hund's second rule; it will be substantiated in considering the
fine structure of atomic levels.
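
The two rules can be turned into a short procedure for the ground term of a shell of equivalent electrons. The sketch below is an illustration under stated assumptions, not the book's algorithm: the names `ground_term` and `term_symbol` are hypothetical, and the code simply places electrons with parallel spins into the states $m = l, l - 1, \ldots$ first, then applies the rule for $J$ quoted above.

```python
from fractions import Fraction

TERM_LETTERS = "SPDFGHIK"  # spectroscopic letters for L = 0, 1, 2, ...


def ground_term(l, n_electrons):
    """Return (S, L, J) of the lowest level for n equivalent electrons."""
    half = 2 * l + 1
    if not 0 <= n_electrons <= 2 * half:
        raise ValueError("a shell holds at most 2(2l + 1) electrons")
    m_values = list(range(l, -l - 1, -1))      # l, l-1, ..., -l
    n_up = min(n_electrons, half)              # first stage: parallel spins
    n_down = n_electrons - n_up                # second stage: opposite spins
    S = Fraction(n_up - n_down, 2)
    L = abs(sum(m_values[:n_up]) + sum(m_values[:n_down]))
    # First stage of filling: J = |L - S|; second stage: J = L + S.
    J = abs(L - S) if n_electrons <= half else L + S
    return S, L, J


def term_symbol(l, n_electrons):
    """Format the ground level as, e.g., '4S_3/2'."""
    S, L, J = ground_term(l, n_electrons)
    return f"{int(2 * S + 1)}{TERM_LETTERS[L]}_{J}"


if __name__ == "__main__":
    for n in range(1, 6):
        print(f"(np)^{n}: {term_symbol(1, n)}")
```

For the $(np)^3$ configuration this reproduces the $^4S_{3/2}$ level discussed for nitrogen.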

Let us also determine how many electrons in an atom can have
a given principal quantum number $n$. Since $l$ varies from zero to
$n - 1$, and since for a given $n$ and $l$ there can be $2(2l + 1)$ electrons,
we find that in an atom Pauli’s exclusion principle permits
a total of

$$\sum_{l=0}^{n-1} 2(2l + 1) = 2n^2 \qquad (33.25)$$

electrons with the given $n$.
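
As a quick arithmetic check of (33.25), the sum can be evaluated directly. The helper name `shell_capacity` is ours and is used only for this illustration.

```python
def shell_capacity(n):
    """Sum 2(2l + 1) over l = 0, ..., n - 1 for principal quantum number n."""
    return sum(2 * (2 * l + 1) for l in range(n))


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, shell_capacity(n), 2 * n ** 2)
```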

The Thomas-Fermi Method. Fock’s method gives satisfactory 
quantitative results when applied to the problem on the motion of 
several electrons. But for a large number of electrons the equations, 
naturally, become so complex and involved that it is difficult to 
derive any general laws from them. For such cases there is a more 
approximate, but very general, method of approach to the many- 
electron problem based precisely on the fact that the number of 
electrons in the atom, Z, is great in comparison with unity. 

This approximate method was suggested independently by Thomas 
and Fermi on the basis of intuitive, but extremely graphic, reason¬ 
ing. Subsequently Dirac showed that the Thomas-Fermi approxima¬ 
tion could be developed from Fock’s equations by applying the quasi- 
classical approximation. The precision of the method, that is, 
the permissible error, is of the order $Z^{-2/3}$.

Despite the fact that it is always preferable to present the stricter 
method of deduction, enabling an evaluation of the error, we shall 
take a graphic approach to the Thomas-Fermi equations, which 
offers a better understanding of their physical essence. Furthermore, 
even if the error in passing from Fock’s equations to the Thomas- 
Fermi equations can be seen, the degree of approximation of Fock’s
equations in comparison with the exact quantum mechanical equations
has to this day not been evaluated at all. All we know is that
the numerical expression of the error is usually not great, but why 
this is so remains theoretically unexplained. 

We shall proceed from the equations of Section 28 for the possible
number of particle states. From Eq. (28.23), the number of states
of a quantum mechanical particle with linear momenta lying between
$p_x$ and $p_x + dp_x$, $p_y$ and $p_y + dp_y$, $p_z$ and $p_z + dp_z$, in a
volume $V$ is

$$dN(p_x, p_y, p_z) = \frac{V\, dp_x\, dp_y\, dp_z}{(2\pi h)^3}$$

Since an electron additionally possesses an internal degree of 
freedom (spin), the right-hand side of this equation should be 
multiplied by two. Furthermore, electrons are subject to Pauli’s 
exclusion principle whereby no more than one electron can be found in each
state with a given spin projection.

Let there be a system of N electrons. What is the least possible 
value of their kinetic energy? 

It is obvious that, according to Pauli’s exclusion principle, they 
cannot all have zero kinetic energy, since they would then all have 
to be in a state with p x = p y = p z = 0, which is prohibited. Only 
two electrons with spin projections of opposite sign can have the 
same linear momentum (including zero). The next two electrons 
must have a momentum differing somewhat from zero. 

In the ground state for all the electrons as a whole each pair of 
electrons must occupy a vacant state lying as close as possible to the 
zero-momentum state and unoccupied by another pair. It is obvious 
that a sufficiently large number of electrons will occupy a sphere in 
momentum space centred at the origin of the coordinate system. 
The volume of this sphere is made up of separate cubes of volume 
$(2\pi h)^3/V$, each cube containing two electrons. Any deviation from
the spherical shape of the volume in a momentum space filled with 
electrons leads to an increase in the total energy, that is, a deviation 
from the ground state. 

Let the greatest electron energy within this sphere be $E_0$. Then
from Eq. (28.25) it is easy to relate $E_0$ to the total number of electrons
in the sphere. Adding the factor 2, which takes account of
spin, we obtain

$$N = \frac{2^{1/2} m^{3/2}}{\pi^2 h^3}\, V \int_0^{E_0} E^{1/2}\, dE = \frac{2^{3/2} m^{3/2} V}{3\pi^2 h^3}\, E_0^{3/2}$$

The ratio $N/V$ is essentially the electron density $n$, so that the
relation between the boundary electron energy $E_0$ and the density is

$$E_0 = \frac{3^{2/3} \pi^{4/3} h^2}{2m}\, n^{2/3} \qquad (33.26)$$
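
To get a feeling for the magnitudes in (33.26), here is a minimal numerical sketch. It rests on assumptions that are not part of the text: SI units, standard tabulated values of the constants, and the reading of the book's $h$ as the reduced Planck constant. The density used in the example, $8.47 \times 10^{28}$ m$^{-3}$, is an illustrative figure of the order often quoted for conduction electrons in copper. The function names are ours.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s (the text's h)
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # one electronvolt in joules


def boundary_energy(n):
    """Boundary kinetic energy E0 (J) for electron density n (m^-3), Eq. (33.26)."""
    return HBAR ** 2 * (3.0 * math.pi ** 2 * n) ** (2.0 / 3.0) / (2.0 * M_E)


def density_from_energy(e0):
    """Inverse relation: electron density (m^-3) for boundary energy e0 (J)."""
    return (2.0 * M_E * e0) ** 1.5 / (3.0 * math.pi ** 2 * HBAR ** 3)


if __name__ == "__main__":
    n_example = 8.47e28  # assumed illustrative density, m^-3
    e0 = boundary_energy(n_example)
    print(f"E0 = {e0 / EV:.2f} eV")
    print(f"round trip density = {density_from_energy(e0):.3e} m^-3")
```

Note that $(3\pi^2)^{2/3} = 3^{2/3} \pi^{4/3}$, so the code evaluates the same expression as (33.26); the boundary energy grows as $n^{2/3}$.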


Actually, this relation refers to the kinetic rather than the total
energy of the electrons. The difference between the total energy and 
the kinetic energy is essential only when the motion occurs in a 
domain in which the potential energy varies. 

Let us consider the potential-energy curve in Figure 38. The
kinetic energy of a particle in the domain $0 \leqslant x \leqslant a$ is plotted
from $U = 0$, and in the domain $x > a$, from $U = U_0$. (We assume
that the domains are large enough to apply the quasi-classical
approximation for the given kinetic energy, that is, that they are
substantially greater than the de Broglie wavelength. In this case
the kinetic and potential energies have approximate meaning as
separate quantities.)

Figure 38 (axes: $U$ versus $x$; the levels $E_0$ and $U_0$ are marked)

For such a potential-energy curve the least total energy is obtained 
when its maximum value $E_0 + U$ is the same in both domains
(analogous to the way a liquid in communicating vessels assumes 
the same level in both). If the maximum electron energy were greater 
in one domain than in the other, the total energy of the electrons 
could decrease on account of their passing to vacant places in the 
domain with lower maximum energy. But we are looking for the 
ground state of a system, in which the total energy can decrease 
no more. 

Thus, the condition of the electrons being in the state with the 
least total energy is 

E 0 + U = constant (33.27) 

This condition can also be applied in cases when the potential 
energy varies spatially not in jumps, as in Figure 38, but smoothly, 
as in the atom. It is only necessary for it not to vary too greatly 
over a distance equal to one de Broglie electron wavelength, in 
accordance with the general provision for the applicability of the 
quasi-classical approximation (see Sec. 31). But when the number of
electrons is large their maximum kinetic energy is great, hence the de
Broglie wavelength is small. Therefore, the condition for the applica¬ 
tion of this method is that the number of electrons in an atom be 
sufficiently large in comparison with unity. 

The potential energy distribution in the atom looks approxi¬ 
mately like the curve presented in Figure 39. The potential energy is 
negative everywhere because it is gauged to zero at infinity. 

The boundary energy of the electrons should not be positive 
anywhere, because with positive energy they could recede from the 



atom into infinity. Hence, the boundary energy can be either negative 
or zero. We shall show that it cannot be negative, that is, it is zero. 
If, for example, the boundary energy were defined by the dashed line 
in Figure 39, then the electron density would have to vanish at the 
point r = r 0 : from Eq. (33.26) at the point where the kinetic energy 
is zero the density also vanishes. 

If we assume that at some point $r = r_0$ the electron density is zero,
then we must accept that all the electrons are to be found at $r \leqslant r_0$,
so that the total charge of the atom, both the positive and the negative,
is concentrated in a sphere of radius $r = r_0$. But then, by the
Gauss law, the electric field must become zero at point $r = r_0$,
since the charge distribution is taken to be spherically symmetrical.
But $|\mathbf{E}| = -d\varphi/dr$, and from Figure 39 it is apparent that the
derivative of the potential at this point is not zero. Consequently,
the only possibility is that the whole charge is concentrated in a
domain on whose boundary the derivative of the potential vanishes.

In Figure 39, the derivative of the potential tends to zero as
$r \to \infty$. It follows that the boundary energy of the electrons is equal
to zero, as asserted. However, it is possible to visualize a case when
the potential energy curve approaches the abscissa at a horizontal
tangent for finite $r$. This also yields zero for the boundary energy.
It will be shown further on, however, that this does not occur. In any
case, we shall proceed from the consideration that the boundary
energy is equal to zero. The constant in Eq. (33.27) is precisely that
boundary energy.

The potential energy of an electron is $U = -e\varphi$, while the kinetic
energy must be expressed in accordance with (33.26). We then obtain
the basic relationship for the Thomas-Fermi method between the
potential in the atom and the electron density at a given point:

$$3^{2/3} \pi^{4/3} h^2 (2m)^{-1} n^{2/3} - e\varphi = 0 \qquad (33.28)$$

We obtain the second relation between the potential and the
density from Eq. (16.7). We put the opposite sign in the right-hand
side since the charge of an electron is negative:

$$\Delta \varphi = 4\pi e n \qquad (33.29)$$

Solving Eq. (33.28) for the electron density and substituting it
into (33.29), we obtain an equation describing the electron distribution
in the atom:

$$\frac{1}{r^2} \frac{d}{dr}\, r^2 \frac{d\varphi}{dr} = \frac{2^{7/2} m^{3/2}}{3\pi h^3}\, e^{5/2} \varphi^{3/2} \qquad (33.30)$$
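
The intermediate step can be written out explicitly. The lines below are a worked sketch, assuming Poisson's equation for the electron charge density $-en$ in Gaussian units and a spherically symmetric potential; $h$ is the constant used throughout this section.

```latex
% Solve (33.28) for the density:
n^{2/3} = \frac{2 m e \varphi}{3^{2/3} \pi^{4/3} h^2}
\quad\Longrightarrow\quad
n = \frac{(2 m e \varphi)^{3/2}}{3 \pi^2 h^3}

% Insert into Poisson's equation, written for a function of r only:
\frac{1}{r^2} \frac{d}{dr}\left( r^2 \frac{d\varphi}{dr} \right)
  = 4\pi e n
  = \frac{4\pi e \,(2m)^{3/2} e^{3/2}}{3 \pi^2 h^3}\, \varphi^{3/2}
  = \frac{2^{7/2} m^{3/2}}{3 \pi h^3}\, e^{5/2} \varphi^{3/2}
```

The numerical factor follows from $4 \cdot 2^{3/2} = 2^{7/2}$ and the cancellation of one power of $\pi$.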


We transform this equation in the same way as (20.6). For this
we substitute $\varphi$ in the form

$$\varphi = \frac{Ze}{r}\, \psi \qquad (33.31)$$

The function $\psi$ is dimensionless, since $Ze/r$ has the dimensions of
potential. In the immediate vicinity of the nucleus, $\varphi$ is determined
only by the nucleus, since its potential tends to infinity like $Ze/r$,
while the potential of a spatially distributed electron charge remains
finite. Hence, close to the nucleus the boundary condition consists
in that $\psi(0) = 1$.

At large distances from the nucleus its charge is completely
screened by the charge of the electrons. Therefore the potential must
tend to zero faster than $1/r$. Hence $\psi(\infty) = 0$.

Substituting (33.31) into (33.30), we find the equation for $\psi$:

$$\frac{d^2\psi}{dr^2} = \frac{2^{7/2} Z^{1/2} m^{3/2} e^3}{3\pi h^3}\, \frac{\psi^{3/2}}{r^{1/2}} \qquad (33.32)$$

It is convenient to eliminate the dimensional factor in the right-hand
side. For that we have to introduce a new unit of length similar
to the atomic unit (see (29.30)):

$$r = \frac{(3\pi)^{2/3}}{2^{7/3} Z^{1/3}}\, \frac{h^2}{m e^2}\, x \qquad (33.33)$$

This unit differs from the atomic unit by the factor $0.889/Z^{1/3}$.
Introduction of the dimensionless variable $x$ reduces (33.30) to the
standard form

$$\frac{d^2\psi}{dx^2} = \frac{\psi^{3/2}}{x^{1/2}} \qquad (33.34)$$

of the Thomas-Fermi equation.

Now neither the equation nor the boundary conditions for it, that
is, $\psi(0) = 1$ and $\psi(\infty) = 0$, involve the atomic number. It is
therefore sufficient to integrate (33.34) once for all atoms. But the
equation can, of course, be applied only to atoms with large and
medium atomic numbers.
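
The single integration mentioned here can be sketched numerically. The following Python fragment is an illustration under our own assumptions, not the method used in the text: it takes the equation in the form $d^2\psi/dx^2 = \psi^{3/2}/x^{1/2}$, starts from a short series expansion near $x = 0$, integrates with a fixed-step Runge-Kutta scheme, and finds the initial slope $\psi'(0)$ by bisection so that $\psi$ neither crosses zero nor turns upward. All function names, step sizes and bracketing values are hypothetical choices.

```python
import math


def psi_second(x, psi):
    """psi'' = psi^(3/2) / x^(1/2); psi is clamped at zero for trial solutions."""
    return max(psi, 0.0) ** 1.5 / math.sqrt(x)


def series_start(slope, x0):
    """psi and psi' at small x0 from the expansion about x = 0 with psi(0) = 1."""
    s = math.sqrt(x0)
    psi = (1.0 + slope * x0 + (4.0 / 3.0) * x0 * s
           + 0.4 * slope * x0 * x0 * s + x0 ** 3 / 3.0)
    dpsi = slope + 2.0 * s + slope * x0 * s + x0 * x0
    return psi, dpsi


def integrate(slope, x_end, h=0.005, x0=0.01):
    """RK4 outward from x0; stops early if psi < 0 or psi' > 0."""
    x = x0
    psi, dpsi = series_start(slope, x0)
    path = [(x, psi)]
    while x < x_end:
        k1p, k1d = dpsi, psi_second(x, psi)
        k2p, k2d = dpsi + 0.5 * h * k1d, psi_second(x + 0.5 * h, psi + 0.5 * h * k1p)
        k3p, k3d = dpsi + 0.5 * h * k2d, psi_second(x + 0.5 * h, psi + 0.5 * h * k2p)
        k4p, k4d = dpsi + h * k3d, psi_second(x + h, psi + h * k3p)
        psi += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        dpsi += h * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
        x += h
        if psi < 0.0 or dpsi > 0.0:
            break
        path.append((x, psi))
    return path, psi, dpsi


def find_slope(x_end=30.0, iterations=40):
    """Bisection on psi'(0): too steep -> psi crosses zero; too shallow -> psi rises."""
    lo, hi = -2.0, -1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        _, psi, _ = integrate(mid, x_end)
        if psi < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def peak_of_x_psi(slope, x_end=6.0):
    """Return (max of x*psi, position of the maximum) along the solution."""
    path, _, _ = integrate(slope, x_end)
    return max((x * psi, x) for x, psi in path)


if __name__ == "__main__":
    slope = find_slope()
    peak, x_peak = peak_of_x_psi(slope)
    print(f"psi'(0) = {slope:.4f}")
    print(f"max of x*psi = {peak:.3f} at x = {x_peak:.2f}")
```

With these settings the slope should come out close to $-1.59$ and the product $x\psi$ should peak at a little under 0.49 near $x \approx 2$, which is the quantity that enters the estimate of the largest angular momentum further on. Since neither $Z$ nor any dimensional constant appears in the code, the same curve serves all atoms.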

Going back to the dimensional radius $r$, we find that the function
$\psi(x)$ yields the potential distribution for each $Z$:

$$\varphi = \frac{Ze}{r}\, \psi\!\left( 1.125\, Z^{1/3}\, \frac{m e^2}{h^2}\, r \right) \qquad (33.35)$$

If the distance from the nucleus is expressed in terms of x, the 
electron density distribution appears the same for all atoms. This, 
naturally, is a property of Eq. (33.34), and not of real atoms. But 
even such a highly approximate treatment makes it possible to 
draw important conclusions concerning the atom as a whole. 

Since for the same values of $x$ the value of $\psi(x)$ is the same for all
atoms, the corresponding values of $r$ for different atoms are inversely
proportional to $Z^{1/3}$. Hence, in heavy atoms the bulk of the electrons
is concentrated closer to the nucleus than in lighter atoms.

We shall now show that the boundary condition $\psi = 0$ can be
imposed in a noncontradictory manner only for $x = \infty$. It was
already mentioned that the potential curve must approach the
point $x_0$ where $\psi(x)$ vanishes only with a horizontal tangent.
Consequently, close to that point the expansion of $\psi(x)$ commences with
such a term:

$$\psi(x) = a\, (x - x_0)^{2+k}$$

where $k$ is a positive number. Substituting this expansion into
Eq. (33.34), we have

$$(2 + k)(1 + k)\, a\, (x - x_0)^k = \frac{a^{3/2}}{x_0^{1/2}}\, (x - x_0)^{3 + 3k/2}$$

whence it follows that $k = -6$, contrary to the requirement that
$k \geqslant 0$. Only an asymptotic osculation with the abscissa does not
lead to a contradiction.


The Appearance in Atoms of Electrons with a Given Value of $l$.
In developing the Thomas-Fermi equation we proceeded from a
momentum distribution of electrons. However, the question can also
be posed of their distribution according to other integrals of the 
motion. Since in the atom, that is, in a central field, angular mo¬ 
mentum is conserved, it is natural to seek an electron distribution 
according to angular momenta. 

Let us first see what the angular momentum distribution of the 
electrons in an atom is like. The boundary momentum of the elec¬ 
trons is proportional to the square root of the boundary kinetic 
energy E 0 , since from (33.26) the electrons having the largest momen¬ 
ta lie closer to the nucleus. But the angular momentum is propor¬ 
tional to the product of the linear momentum multiplied by the 
distance from the nucleus, hence close to the nucleus an electron’s 
angular momentum is small. At greater distances from the nucleus 
the electron density decreases, and accordingly the boundary mo¬ 
mentum decreases too. This again leads to a decrease in angular 
momentum. It follows, then, that the angular momentum attains 
its maximum value somewhere at median distances from the nucleus. 
This maximum angular momentum is the greater the higher the 
electron density. That is why in heavy atoms, where electron density 
is great, angular momentum attains great values. 

To determine the maximum values of angular momentum that
are possible for a given $Z$, we shall proceed from the classical
expression for the energy of a particle in a central field:

$$E = \frac{p_r^2}{2m} + \frac{M^2}{2mr^2} - \frac{Ze^2 \psi}{r} \qquad (33.36)$$

In accordance with the basic assumption (33.28), the boundary
energy $E$ should be put equal to zero. Then for the radial component
of the linear momentum we obtain the following formula:

$$p_r = \left( \frac{2mZe^2 \psi}{r} - \frac{M^2}{r^2} \right)^{1/2} \qquad (33.37)$$

In place of $M^2$ we can put $h^2 l(l + 1)$. But since the present theory
corresponds wholly to the quasi-classical approximation, a rather
better result is obtained for the quasi-classical eigenvalue of $M^2$.
It can be shown (though we shall not do this) that the quasi-classical
eigenvalue of $M^2$ is $h^2 (l + 1/2)^2$. Note that $(l + 1/2)^2$ differs from
$l(l + 1)$ by only 1/4.

We now take $h/r$ outside the parentheses in (33.37) and express
the remaining expression in terms of the dimensionless quantity $x$:

$$p_r = \frac{h}{r} \left( 1.778\, Z^{2/3} x\psi - \left( l + \tfrac{1}{2} \right)^2 \right)^{1/2} \qquad (33.38)$$

For $p_r$ to be a real quantity, the radicand must remain positive within
a certain interval of values of $x$. But since $x\psi = 0$ at $x = 0$ and
at $x = \infty$, the interval is finite and includes the maximum of $x\psi$.
The maximum is equal to 0.488, hence the whole interval in which $p_r$
is a real quantity contracts into a point at a value of $Z$ such that

$$1.778 \times 0.488\, Z^{2/3} = \left( l + \tfrac{1}{2} \right)^2 \qquad (33.39)$$

Then the curve $y = 1.778\, Z^{2/3} x\psi$ touches the horizontal line
$y = (l + 1/2)^2$. It follows that a given value of $l$ in an atom can appear
for $Z$ satisfying the condition

$$Z = 0.155\, (2l + 1)^3 \qquad (33.40)$$

According to this equation, electrons having $l = 2$ appear at
$Z = 19$, and those having $l = 3$, at $Z = 53$. A better agreement
with reality is obtained if in the latter formula we take 0.17 instead
of 0.155.
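
The arithmetic behind (33.39) and (33.40) is easy to reproduce. The sketch below uses the numbers 1.778 and 0.488 quoted in the text; the function names and the convention of rounding $Z$ to the nearest integer are our assumptions.

```python
def coefficient(prefactor=1.778, peak=0.488):
    """Coefficient c in Z = c (2l + 1)^3, obtained by solving (33.39) for Z.

    (33.39) reads prefactor * peak * Z^(2/3) = (l + 1/2)^2 = (2l + 1)^2 / 4,
    hence Z = (2l + 1)^3 / (4 * prefactor * peak)^(3/2).
    """
    return (4.0 * prefactor * peak) ** -1.5


def threshold_z(l, coeff=0.155):
    """Atomic number at which electrons with orbital number l can first appear."""
    return coeff * (2 * l + 1) ** 3


if __name__ == "__main__":
    print(f"coefficient from (33.39): {coefficient():.4f}")
    for l in (1, 2, 3):
        print(l, round(threshold_z(l)), round(threshold_z(l, coeff=0.17)))
```

With the coefficient 0.17 the rounded thresholds for $l = 2$ and $l = 3$ become 21 and 58, which is the improvement in agreement mentioned above.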

The whole of the foregoing calculation is possible only because 
account is taken of the screening of the nuclear field by the electrons, 
with which the appearance of the maximum of the function $x\psi$ is
associated. Nothing of the kind could occur in a pure Coulomb field. 
This points to the fact that, as the atomic number increases, so does 
the dependence of electron energy on the orbital quantum number. 

When the field differs substantially from a Coulomb field, the
dependence of the energy upon $l$ becomes so strong that an increase
in the principal quantum number $n$, with a simultaneous reduction
in $l$, leads to a slower increase in energy than an increase in $l$ for
a given $n$.

The reason is that for large $l$ the angular momentum arm is large,
that is, the electron is far from the nucleus. But in an atom the 
dependence of an electron’s potential energy upon r does not obey 
the Coulomb law: it decreases much faster, owing to screening. 
That is why at large distances from the nucleus the electron is as it 
were ejected from the potential well by a centrifugal force, which 
leads to a comparatively large rise in the energy level. 

The increase in energy in a transfer from the energy level $E(n, l)$
to $E(n + 1, 0)$ turns out to be less than in a transfer from $E(n, l)$
to $E(n, l + 1)$. In a Coulomb field the energy depends only upon $n$,
and a transfer from $E(n, l)$ to $E(n, l + 1)$ does not change the
energy at all, while a transfer from $E(n, l)$ to $E(n + 1, 0)$ leads
to a rise in level. The inequality
$E(n + 1, 0) - E(n, l) > E(n, l + 1) - E(n, l)$ also occurs in a weak
deflection of the field from the Coulomb field.

The Mendeleyev Periodic Table. Let us now consider, in general 
terms, the filling of the spaces in the Mendeleyev Periodic Table. 
With few electrons in the atoms, the dependence of an electron’s 
energy on the quantum numbers does not differ substantially from 
that found in a purely Coulomb field, so that the ordering of energy
levels is determined by the principal quantum number n. Accordingly,
filling of the atom’s electron shells takes place according to increas¬ 
ing values of n. 

As can be seen from Eq. (33.25), the shell with n = 1 can contain 
two electrons. Hydrogen has one electron, its state is $1s$; in helium
the shell is filled, the state being $(1s)^2\, {}^1S_0$. The electron shell of the
ground state of a helium atom is so stable that if any other atom 
approaches it the total energy can only increase, so that repulsive 
forces appear. Helium is chemically inert. The interaction forces 
between helium atoms are small as a result of the symmetry and 
stability of their electron shells. Therefore, helium is liquefied at 
an extremely low temperature. 11 

After helium, the filling of the shell with $n = 2$ begins. The first
electron of this shell, that is, a $2s$-electron, appears in lithium. The
two inner $1s$-electrons occurring in the helium configuration form
a closed shell (the $K$ shell) and strongly screen the nuclear charge;
consequently, the outer electron is weakly bound. Such is the
alkali-metal electron configuration in the case of lithium, and analogous
electron configurations subsequently result each time (Na, K, Rb,
Cs) from the addition of an $s$-electron to a nucleus surrounded by
a noble-gas electron shell.

Lithium is followed by beryllium, the atom of which has two
$2s$-electrons, the configuration being $(1s)^2 (2s)^2\, {}^1S_0$. Although the
$(2s)^2$ shell (the $L_{\mathrm{I}}$ shell) is filled, its properties are quite unlike the
properties of the noble gas helium with the filled $(1s)^2$ shell. The
reason for this is that the field acting on an electron in the light
element beryllium still approximates the Coulomb field sufficiently,
and the energy of the $2s$ state is therefore close to the energy of the
$2p$ state: it is but slightly dependent upon the orbital quantum
number. A small energy is needed for the transition of an electron 
from the $2s$ shell to the $2p$ shell. Because of this the electron
configuration of beryllium is unstable with respect to perturbations.
The small energy that evolves in the joining of other atoms
is sufficient to cause the restructuring of the $2p$ shell required to form
a stable system of bound atoms.

After beryllium, filling of the $2p$ shell (the $L_{\mathrm{II}}$ shell) begins,
until it is completely filled for neon. Neon is preceded by fluorine,
which lacks one electron to completely fill the $L_{\mathrm{II}}$ shell. The energy
required to add an electron to the $(2p)^5$ shell of fluorine, completing
it as a neon shell, is large. This explains the chemical activity of
fluorine and the other halogens, which are similarly situated with
respect to the noble gases.

11 The condensation of helium into a liquid at low temperatures is due
to the so-called Van der Waals forces, which arise out of the mutual
electrostatic polarization of approaching atoms. These forces act at larger distances
than the forces of chemical affinity, and are very small compared with them.

A shell with n = 2 may contain eight electrons. The elements 
in which the electrons occupy this shell form the second period of 
the Mendeleyev Periodic Table (H and He form the first period). 
Then the shell with $n = 3$ is filled, but initially only its first two
subshells, $3s$ and $3p$ ($M_{\mathrm{I}}$ and $M_{\mathrm{II}}$). The structure of the outer
electron shells in elements of the third period is similar to that of the
shells in elements of the second period. 

The chemical properties of atoms are in the main determined by 
their outer electron shells. This explains the similarity of chemical 
properties on which the periodic law is based. 

In argon , the 3 p shell is filled, thus completing one more period 
of eight elements in the Periodic Table. Argon owes its noble gas 
configuration to the fact that the 3 p state, on the one hand, and the 
3d and 4 s states, on the other, differ considerably in energy. One 
could say that in argon there takes place an equalization of the 
effects of the principal quantum number and the orbital quantum 
number on the energy level: their effect is approximately the same 
and still strong enough for the element to have the stable electron 
shell of a noble gas. But unlike helium, argon is nevertheless capable 
of forming chemical compounds. 

In investigating the states of shells which require less than half
the total number of electrons to be filled (that is, less than $2l + 1$),
we may consider that the unoccupied states (“holes”) behave like
electrons. For example, if an $np$ shell lacks two electrons to six,
we may combine the states of the two “holes” in the same way as we
combine the states of two $np$-electrons. The result is always correct,
but here, to find the total angular momentum of such a system of
two “holes”, we apply Hund’s second rule, that is, take $J = L + S$.
By adding the spins and orbital angular momenta, with due account 
for Pauli’s exclusion principle, we can readily verify that the four 
electrons in the np shell are equivalent to the two “holes” in it (see 
Exercise 2). 

The concept of a “hole” in a system of occupied states and the 
equivalence of a “hole” and a particle has proved to be extremely 
useful in many branches of physics where many-electron systems 
are studied. 

Let us now give, in one table, the scheme for building up the 
first eighteen places in the Periodic Table. It also gives the number 
of electrons having given quantum numbers. 

After argon, the stronger effect of the orbital quantum number 
than the principal quantum number on the energy comes into play. 
In other words, from the energy point of view filling of the 4 s shell 
turns out to be preferable to filling of the 3d shell. The new period 
begins with the alkali metal potassium. We note an empirical 
regularity: filling occurs for the same n + l. The sum n + l is the same 
for the 3p and 4s shells and is equal to 4, while it is already greater 
by unity in the 3d shell. The 4p shell is filled after the 3d shell, 
with the same value of the sum n + l = 5, and then the 5s shell. 
It is seen that this rule is observed later on, too, the filling of the 
shells with the same sum n + l proceeding in order of increasing n. 
But there are certain deviations from this rule during the filling 
of the d and f shells. 
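
The empirical n + l rule is easy to state as a sorting procedure. The sketch below is an illustration only (the function name `filling_order` is hypothetical); it orders the shells by the sum n + l and, for equal sums, by increasing n, and it ignores the deviations in the d and f shells mentioned above.

```python
LETTERS = "spdfghi"  # spectroscopic letters for l = 0, 1, 2, ...

def filling_order(max_sum):
    """List shells with n + l <= max_sum in the order given by the n + l rule."""
    shells = [(n, l)
              for n in range(1, max_sum + 1)
              for l in range(n)          # l runs from 0 to n - 1
              if n + l <= max_sum]
    # Primary key: the sum n + l; secondary key: n itself.
    shells.sort(key=lambda nl: (nl[0] + nl[1], nl[0]))
    return [f"{n}{LETTERS[l]}" for n, l in shells]

print(filling_order(7))
```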

In the shells with n = 1, 2, 3 there are altogether 
2 × 1² + 2 × 2² + 2 × 3² = 2 + 8 + 18 = 28 electrons. There are 
further eight electrons in the 4s and 4p states, and another two 
electrons in the 5s state. The 5s state is followed by electrons with 
n + l = 6, where we begin with the least n, that is, with 4d. There 
are 2(4 + 1) = 10 more of these electrons. The 4d-electrons are 
followed by 5p-electrons, of which there are six, and then by the 
same rule we get the 6s state. 

The next value is n + l = 7, the least n being n = 4. Hence, 
beginning with the 57th place (in actuality, with the 58th place) the 4f 
shell can begin to fill, acquiring at once two 4f-electrons. 
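
The count leading to the 57th place can be reproduced mechanically. This is a small sketch under the idealized n + l rule only (the function name `first_place_of` is hypothetical): it adds up the capacities 2(2l + 1) of all shells filled before the given one.

```python
def first_place_of(target_n, target_l, max_sum=8):
    """Place in the Periodic Table at which the shell (n, l) starts to fill,
    assuming strict filling by the n + l rule."""
    shells = sorted(
        ((n, l) for n in range(1, max_sum + 1) for l in range(n)
         if n + l <= max_sum),
        key=lambda nl: (nl[0] + nl[1], nl[0]),
    )
    filled = 0
    for n, l in shells:
        if (n, l) == (target_n, target_l):
            return filled + 1
        filled += 2 * (2 * l + 1)   # capacity of a completed shell
    raise ValueError("shell lies beyond max_sum")

print(first_place_of(4, 0))  # 4s shell
print(first_place_of(3, 2))  # 3d shell
print(first_place_of(4, 3))  # 4f shell
```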

This agrees well with the result obtained on the basis of the 
Thomas-Fermi method for the appearance of electrons with l = 3. 
We saw that electrons with the maximum angular momentum val¬ 
ues appear first in the middle of the atom, at the value of the dimen¬ 
sionless variable x = 0.488 (this, of course, is an extremely rough 
estimate). 

The same can be said in considering the motion of electrons in 
a central field decreasing not according to the Coulomb law, but 
faster, due to screening of the field by other electrons. Specifically, 
the Thomas-Fermi potential decreases according to the law (Ze/r) ψ(x), 
approximately as r⁻⁴, that is, faster than the centrifugal energy 
ħ²l(l + 1)/(2mr²). If we add the potential energy of the electron, 
calculated with due account of screening, and the centrifugal energy, 
we find that in the d and f shells the minimum of the effective total 
potential energy lies at the middle of the atom (cf. Sec. 5). 

The curve U_M for d- and f-electrons at large r goes higher than 
for s- and p-electrons, and it turns out that the effective potential 
well (the minimum on the curve U_M) lies closer to the nucleus 
than the boundaries of the s- and p-electron shells. Thus, filling 
of the d and f shells occurs as it were within the atom. But the 
chemical properties of atoms depend primarily on the outer electrons, 
the states of which change but slightly in the filling of the f shell. 
This is how the group of 2 (2 X 3 + 1) = 14 chemically similar 
elements, known as the rare-earth elements , is formed. 

In the filling of the 3 d shell, interaction with the outer electrons 
is stronger; as a consequence, instead of a group of similar elements 
there appears a series of elements with irregular variations of chem¬ 
ical properties. A “contest” as it were takes place between the 3d 
shell and the outer shell for the energetically most preferable state: 
for example, V (Z = 23) has three d-electrons and two s-electrons, the 
next element, Cr (Z = 24), has five d-electrons and one s-electron, while 
Mn (Z = 25) also has five d-electrons, but two s-electrons. 

Filling of the 5f shell takes place, starting with thorium, for 
a whole group of elements similar to the rare earths. Most of these 
are artificially produced transuranium elements. 

Nuclear Shells. The electron configurations of the noble gases 
correspond to an especially large binding energy of the electrons 
in the atom. More work must be done to remove one electron from 
the atom of a noble gas than from the atom of any other element. 
When one more electron is added to the electron shell of the 
noble-gas type, as in alkali metals, it is very weakly bound. 

The noble gases occupy very specific places in the Mendeleyev 
Periodic Table: 2, 10, 18, 36, etc.; this is explained by the model of 
filling electron shells, which are in turn formed according to the 
quantum numbers of individual electrons. But for the concept of 
quantum numbers of a separate electron to be meaningful we must 
assume that the electron is subjected to the action of the self-con¬ 
sistent field of all the other electrons. 

As was pointed out before, the self-consistent field method as yet 
remains without any strict substantiation in atomic physics. It is 
usually said that the Coulomb electrostatic forces act at a distance 
and therefore each electron is indeed subject to the influence of all 
the others. 

It has been proved experimentally that atomic nuclei are also 
characterized by especially stable states for each “variety” of specific 
numbers of nucleons (i.e., neutrons and protons) similar to the 
ground state of noble gases. These numbers are 2, 8, 20, 50, 82, 
and 126. But it is well known that, unlike Coulomb forces, nuclear 
forces act at short range and therefore the argument in favour of the 
applicability of the self-consistent field concept to the atom is ir¬ 
relevant with respect to the nucleus. 

But since the numbers listed above nevertheless do manifest 
themselves in very many experimentally observable properties 
of nuclei, the tongue-in-cheek name of “magic numbers” was sug¬ 
gested to stress their incompatibility with theoretical expectation. 
Apparently, the self-consistent field concept is nevertheless appli¬ 
cable to the nucleus. Without going into the possible reasons for 
this, we shall go through the reasoning making it possible to deduce 
these numbers from a certain simple theoretical model. 

It was pointed out in Section 28 that the forces acting between 
the particles in a nucleus can be described with the help of the effec¬ 
tive potential well. Suppose that a certain self-consistent field 
does nevertheless exist in the nucleus, and it causes each nucleon 
to move in such a well. The problem of particle motion in a rectan¬ 
gular well was solved in Section 28, and it was shown that it refers 
not to the one-dimensional case but, in effect, to the three-dimen¬ 
sional case, only as applied to an orbital angular momentum eigen¬ 
value equal to zero. 

Such a problem can also be solved for an arbitrary orbital angular 
momentum of a particle. In order to approximate the solution some¬ 
what closer to reality, we take not a strictly rectangular well, as 
in Figure 31, but one with slightly rounded edges. Then the energy 
levels are arranged in groups. Each level in the nucleus is 
conventionally denoted by the radial, rather than the principal, quantum 
number, that is, by the number of zeros of the radial function, and 
by the orbital quantum number. These groups have been found to be: 
1s; 1p; 1d, 2s; 1f, 2p; 1g, 2d, 3s; 1h, 2f, 3p; 1i, 2g, 3d, 4s. (We 
recall that the letters g, h, i denote the respective angular momenta 
l = 4, 5, 6. There may be 2(2l + 1) particles in a state with each l.) 

Therefore, if we try to correlate the listed groups of separate energy 
states with the respective shells, we obtain the following numbers 
of occupied states: 2; 2 + 6 = 8; 8+10 + 2 = 20; 20 + 14 + 
+ 6 = 40; 40 + 18 + 10 + 2 = 70; 70 + 22 + 14 + 6 = 112. 

They agree with the magic numbers only up to 20. 
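
The running totals quoted above follow from the capacities 2(2l + 1) of the listed levels. A short sketch (the names `GROUPS`, `capacity` and `cumulative` are introduced here for illustration only):

```python
L_OF = {"s": 0, "p": 1, "d": 2, "f": 3, "g": 4, "h": 5, "i": 6}

# Level groups of the rounded rectangular well, without spin-orbit coupling.
GROUPS = [
    ["1s"], ["1p"], ["1d", "2s"], ["1f", "2p"],
    ["1g", "2d", "3s"], ["1h", "2f", "3p"],
]

def capacity(level):
    """Number of nucleons of one kind in a level such as '1d': 2(2l + 1)."""
    return 2 * (2 * L_OF[level[-1]] + 1)

def cumulative(groups):
    """Running totals of occupied states after each group is filled."""
    totals, filled = [], 0
    for group in groups:
        filled += sum(capacity(level) for level in group)
        totals.append(filled)
    return totals

print(cumulative(GROUPS))
```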

M. G. Mayer and H. E. Suess explained this discrepancy by the 
fact that the strong spin-orbit coupling of nucleons was not taken 
into account in developing the energy levels. The existence of this 
coupling is well known from nuclear physics, but we shall not go 
into it. Mayer and Suess postulated that for large l's the state with 
angular momentum j = l + 1/2 differs in energy so greatly from 
the state with j = l − 1/2 that it falls within the states of the 
preceding group. In that case the states should be grouped as follows 
(starting with the fourth group): 1f, 2p, 1g_{9/2}; 1g_{7/2}, 2d, 3s, 
1h_{11/2}; 1h_{9/2}, 2f, 3p, 1i_{13/2}. Since in states with j = l ± 1/2 
the spin is rigidly coupled with the orbital angular momentum, they are 
(2j + 1)-fold degenerate. Therefore, after number 20, corresponding to the 
filling of the first three groups, the next occupied group occurs for 
20 + 14 + 6 + 10 = 50; then for 50 + 8 + 10 + 2 + 12 = 82; 
and for 82 + 10 + 14 + 6 + 14 = 126. This corresponds exactly 
to the magic numbers. 12 
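
The regrouped count can be verified in the same way. In the sketch below (names hypothetical), a level given with a value of j holds 2j + 1 nucleons, while a level given without j holds the full 2(2l + 1).

```python
from fractions import Fraction

L_OF = {"s": 0, "p": 1, "d": 2, "f": 3, "g": 4, "h": 5, "i": 6}

# Groups after the regrouping caused by spin-orbit coupling,
# starting with the fourth group; j is None for an unsplit level.
GROUPS_WITH_SPIN_ORBIT = [
    [("1f", None), ("2p", None), ("1g", "9/2")],
    [("1g", "7/2"), ("2d", None), ("3s", None), ("1h", "11/2")],
    [("1h", "9/2"), ("2f", None), ("3p", None), ("1i", "13/2")],
]

def capacity(label, j=None):
    """2(2l + 1) for an unsplit level, 2j + 1 for a level with definite j."""
    if j is None:
        return 2 * (2 * L_OF[label[-1]] + 1)
    return int(2 * Fraction(j) + 1)

def closures(groups, start=20):
    """Nucleon numbers at which each successive group is completed."""
    totals, filled = [], start
    for group in groups:
        filled += sum(capacity(label, j) for label, j in group)
        totals.append(filled)
    return totals

print(closures(GROUPS_WITH_SPIN_ORBIT))
```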

The suggested explanation leads in turn to certain predictions, 
namely that close to the magic numbers one should expect nuclei 
with large angular momenta: j = 9/2, 11/2, . . ., which is in fact 
the case. 

The Ortho- and Para-States of Two Electrons. We shall now show 
how to develop a spin wave function in accordance with exchange- 
symmetry requirements. We shall assume that the total wave func¬ 
tion separates into a product of the coordinate and spin parts and 
consider the spin part separately. Since the whole product is anti¬ 
symmetric, one of its factors must be symmetric and the other anti¬ 
symmetric. As mentioned before, this simple conclusion refers 
only to a two-electron system. 

Since, in atomic units, the spin of each electron is equal to 1/2, 
the total spin of a two-electron system may be either zero or unity. 
These states have special names: the state with unity spin is called 
the ortho-state , the one with zero spin is the para-state. 


12 Close to the magic numbers is the number 28, which is obtained if the 
state 1f_{7/2} is separated from the fourth group. 





Neglecting the magnetic interaction between the spins, we can 
assume that the spin wave function of two electrons can be devel¬ 
oped from the products of the spin functions taken separately, in 
the same way as was done for the coordinate wave function (33.17): 

$$\chi(\sigma_1, s_1; \sigma_2, s_2) = \chi(\sigma_1, s_1)\,\chi(\sigma_2, s_2) \tag{33.41}$$

Now the symmetry requirements must be satisfied. If σ₁ = σ₂, 
the product (33.41) is intrinsically symmetric. If σ₁ ≠ σ₂, then 
either a symmetric or antisymmetric wave function can be formed, 
the same as in (33.17). As a result we obtain three symmetric wave 
functions: 

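$$\begin{aligned}
&\chi(1, s_1)\,\chi(1, s_2) \\
&\frac{1}{\sqrt{2}}\,[\chi(1, s_1)\,\chi(-1, s_2) + \chi(1, s_2)\,\chi(-1, s_1)] \\
&\chi(-1, s_1)\,\chi(-1, s_2)
\end{aligned} \tag{33.42a}$$

and one antisymmetric: 

$$\frac{1}{\sqrt{2}}\,[\chi(1, s_1)\,\chi(-1, s_2) - \chi(1, s_2)\,\chi(-1, s_1)] \tag{33.42b}$$

The factor 1/√2 is introduced for normalization. 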

To the functions (33.42a) there correspond three projections of the 
total spin: 1, 0, and −1, while only one zero projection corresponds 
to the function (33.42b). The value of the spin projection, that is, 
0, 1, or −1, depends on the choice of the z axis. But the symmetry 
or antisymmetry of a wave function is an intrinsic property 
independent of the choice of coordinate axes. Therefore, all three wave 
functions (33.42a) must be considered as belonging to one and the 
same value of total spin, equal to 1, but to three different 
projections, with state (33.42b) belonging to the zero total spin value. 
The total spin, like the wave function symmetry, does not depend 
upon the choice of coordinate axes. 

The division of spin states into ortho- and para-states holds for 
all systems comprising two identical particles, and for all spin 
values. The state with a symmetric spin wave function is the ortho¬ 
state, that with the antisymmetrical wave function is the para-state. 
But at spin values greater than 1/2 there is no one-to-one correspon¬ 
dence between the para- and ortho-states and the total spin. At 1/2 
spin, though, the spin function automatically belongs to either the 
ortho- or the para-state, provided we require that it be the eigen¬ 
function of the total spin operator. In that case it is no longer es¬ 
sential that the particles be identical, provided each possesses 1/2 
spin. Thus, the ephemeral electron-positron (positive electron) 
formation, that is, a system of two different particles, may be in 
the ortho- or para-state, depending upon the value of the total spin. 

The self-consistent field equations involve the expression for the 
exchange energy (see (33.22a) and (33.22b)) with two signs, 
depending upon the symmetry of the spatial wave function. But since 
the symmetry of this function is the reverse of the symmetry of the 
spin wave function, we can say that the sign of the exchange energy 
is associated with the state of the particles: “+” in the para-state, 
and “−” in the ortho-state. The para- and ortho-states, in turn, 
correspond to the different values of the total spin. Therefore, the 
sign of the exchange energy may be related to the eigenvalue of the 
total spin of two electrons. 

In Exercise 2, Section 30, it was shown that the eigenvalue of the 
operator (σ₁·σ₂) is equal to −3 for antiparallel spins and to 1 for 
parallel spins. Hence, if we introduce the operator (1 + σ₁·σ₂)/2, 
its eigenvalues are +1 in the ortho-state and −1 in the para-state. 
And since the symmetry of the spatial wave function is the opposite 
of the symmetry of the spin function, we can simply introduce into 
the Hamiltonian of the system, written in the self-consistent field 
approximation, the exchange energy multiplied by the quantity 
−(1 + σ₁·σ₂)/2. Then the correct sign of the exchange energy will 
be assured in the equations automatically. 
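
The stated eigenvalues can be confirmed with explicit Pauli matrices. The sketch below is an illustration, not part of the original derivation: it assumes the standard representation of the Pauli matrices, the basis ordering ↑↑, ↑↓, ↓↑, ↓↓, and hypothetical helper names `kron`, `add` and `apply`.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i][j] * b[k][l] for j in range(n) for l in range(m)]
            for i in range(n) for k in range(m)]

def add(a, b):
    """Elementwise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def apply(matrix, vector):
    """Matrix times column vector."""
    return [sum(x * v for x, v in zip(row, vector)) for row in matrix]

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

# (sigma_1 . sigma_2) acting on the four two-spin basis states.
sigma_dot = kron(SIGMA_X, SIGMA_X)
for s in (SIGMA_Y, SIGMA_Z):
    sigma_dot = add(sigma_dot, kron(s, s))

identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
exchange = [[(u + v) / 2 for u, v in zip(ri, rs)]
            for ri, rs in zip(identity, sigma_dot)]

ortho = [0, 1, 1, 0]    # symmetric combination, unnormalized
para = [0, 1, -1, 0]    # antisymmetric combination, unnormalized

print(apply(sigma_dot, ortho), apply(sigma_dot, para))
print(apply(exchange, ortho), apply(exchange, para))
```

In this basis the operator (1 + σ₁·σ₂)/2 simply interchanges the spin variables of the two electrons, which is why its eigenvalues are +1 and −1 on the symmetric and antisymmetric functions.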

Thus, we have as it were the effective interaction between electron 
spins, which, as was pointed out, is much larger than the magnetic 
interaction between them. 

The Spin-Orbit Interaction of a Separate Electron. Let us develop 
an operator describing the interaction of an electron’s spin with 
its orbital angular momentum. Obviously, such an operator can be 
determined only for a self-consistent field since, being integrals 
of the motion, the angular momenta of separate electrons cannot 
be stated otherwise than in the assumption that all the other elec¬ 
trons create a static, central-symmetrical field acting on the given 
electron. 

A strict development of a spin-orbit interaction operator is pos¬ 
sible only on the basis of the relativistic electron wave equation 
(Sec. 37). Here we shall restrict ourselves to a semi-intuitive gra¬ 
phic proof. 

Let us pass to a reference frame in which the electron is at rest. 
In this frame it is subject to the action of both an electric and a mag¬ 
netic field, while in the system in which the nucleus is at rest there 
is only an electric field. But we cannot make the transfer directly 
according to the Lorentz transformations for the field components 
(15.1a) and (15.1b), because the frame in which the nucleus is at 
rest is noninertial. We shall therefore transform the reference frame 
gradually, by means of infinitesimal “steps”. 

Since the noninertiality is due to the interaction of the electron 
and the nucleus, we may assume that the interaction itself is effected 
in infinitesimal “portions” by the elementary electric charge’s 
variation from eλ to e(λ + dλ). Since the Bohr magneton is itself 
proportional to the charge, its value here is λβ. 

In an increase of the magnetic field by dH the energy operator 
of the magnetic field acquires an additional term 

$$dV = -\lambda\beta\,(\boldsymbol{\sigma}\cdot d\mathbf{H})$$

At low relative velocities the Lorentz transformations yield 

$$d\mathbf{H} = -\frac{1}{c}\,\mathbf{v}\times d\mathbf{E} = -\frac{1}{mc}\,\mathbf{p}\times d\mathbf{E}$$

Since the self-consistent field is a central field, the electric field 
is equal to 

$$\mathbf{E} = -\operatorname{grad}\varphi = -\frac{\mathbf{r}}{r}\,\frac{d\varphi}{dr}$$

and dE = E dλ. Substituting this into the formula for the magnetic 
energy dV, we find 

$$dV = \frac{\lambda\beta}{mc}\,\bigl(\boldsymbol{\sigma}\cdot[\mathbf{r}\times\mathbf{p}]\bigr)\,\frac{1}{r}\,\frac{d\varphi}{dr}\,d\lambda$$

Replacing r × p by M, and the Bohr magneton β by eħ/(2mc), we 
obtain 

$$dV = \frac{e\hbar}{2m^2c^2}\,(\boldsymbol{\sigma}\cdot\mathbf{M})\,\frac{1}{r}\,\frac{d\varphi}{dr}\,\lambda\,d\lambda$$

Integrating with respect to λ from 0 to 1, that is, passing to the 
total value of the interaction, we find the required spin-orbit 
interaction Hamiltonian 

$$V = \frac{e\hbar}{4m^2c^2}\,(\boldsymbol{\sigma}\cdot\mathbf{M})\,\frac{1}{r}\,\frac{d\varphi}{dr} \tag{33.43}$$


Let us evaluate the order of magnitude of the coefficient in 
Eq. (33.43). For that we note that if we take the Thomas-Fermi 
potential (33.31), then 

$$\frac{d\varphi}{dr} = -\frac{Ze}{r^2}\left(\psi - x\,\frac{d\psi}{dx}\right)$$

The factor ψ − x (dψ/dx) is of the order of unity and will in future 
be discarded. But since the Thomas-Fermi variable, x, is itself 
of the order of unity, we obtain r ~ ħ²/(Z^{1/3} me²). If we express 
the orbital angular momentum operator M in terms of the 
dimensionless operator l, that is M = ħl, and σ in terms of the double 
dimensionless spin operator, 2s, and substitute into Eq. (33.43), 
the numerical factor of (s·l) turns out to have the following order 
of magnitude: 

$$\left(\frac{Ze^2}{\hbar c}\right)^{2}\frac{me^4}{\hbar^2} \tag{33.44}$$

Here, e²/(ħc) is a dimensionless quantity equal to 1/137, so that 
Ze²/(ħc) attains 0.6 at Z ≈ 90; me⁴/ħ² is the atomic unit of energy, 
equal to 27 eV (1 hartree). 
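
To get a feeling for the estimate (33.44), it may be evaluated numerically. The sketch below is an order-of-magnitude illustration only: it drops all numerical factors of order unity, the function name is hypothetical, and the value 27.2 eV for the hartree is the usual rounded figure.

```python
ALPHA = 1 / 137          # e^2 / (hbar c), as quoted in the text
HARTREE_EV = 27.2        # atomic unit of energy in electron volts

def spin_orbit_scale_ev(z):
    """Order of magnitude of the spin-orbit coefficient, Eq. (33.44), in eV."""
    return (z * ALPHA) ** 2 * HARTREE_EV

for z in (7, 30, 90):
    print(z, round(z * ALPHA, 3), round(spin_orbit_scale_ev(z), 3))
```

For light atoms the result is a small fraction of an electron volt, while for Z near 90 it becomes comparable with the atomic unit of energy itself.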

The Interaction of L and S. Thus, the operator of the spin-orbit 
interaction of a separate electron may be written in the form 

$$V = a\,(\mathbf{l}\cdot\mathbf{s}) \tag{33.45}$$

The total spin-orbit interaction Hamiltonian is obtained by sum¬ 
ming V for all the electrons in a certain shell with total spin and 
orbital angular momentum. This is done in the following manner. 
We assume the total spin S of the electrons to be fixed, and we 
average over the spin involved in each term of the spin angular 
momentum of the separate electrons. There then remains only the 
projection of the spin of a separate electron on the total spin: 

$$\langle\mathbf{s}\rangle = b\,\mathbf{S}$$

The averaging was done over the spin states for the given total 
spin considered as an operator. Similarly, the orbital angular mo¬ 
menta are averaged for the given total orbital angular momentum L, 
so that as a result we obtain the spin-orbit interaction Hamiltonian 

$$V_{SO} = A\,(\mathbf{L}\cdot\mathbf{S}) \tag{33.46}$$

Let us find the eigenvalues of this operator. For that we write 

$$\mathbf{J}^2 = (\mathbf{L} + \mathbf{S})^2 = \mathbf{L}^2 + \mathbf{S}^2 + 2\,(\mathbf{L}\cdot\mathbf{S})$$

But L² = L(L + 1), S² = S(S + 1), and J² = J(J + 1), whence 

$$(\mathbf{L}\cdot\mathbf{S}) = \frac{J(J+1) - L(L+1) - S(S+1)}{2} \tag{33.47}$$

Thus, the spin-orbit interaction Hamiltonian is diagonal in states 
with a specified value of the total angular momentum J. For the 
given L and S there are altogether 2S + 1 (if L ≥ S) or 2L + 1 
(if L < S) different values of J; the corresponding states are known 
together as a multiplet. 
Since L and S are the same for all the multiplet components, we 
find that the energy of a multiplet component with the given value 
of J is determined only by the term J(J + 1). 
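
Equation (33.47) can be tabulated for any multiplet. The following sketch (the function name `multiplet` is hypothetical) lists the allowed J together with the eigenvalue of (L·S), using exact fractions so that half-integral values are handled correctly.

```python
from fractions import Fraction

def multiplet(L, S):
    """Pairs (J, eigenvalue of L.S) for J = |L - S|, ..., L + S, Eq. (33.47)."""
    L, S = Fraction(L), Fraction(S)
    components = []
    J = abs(L - S)
    while J <= L + S:
        ls = (J * (J + 1) - L * (L + 1) - S * (S + 1)) / 2
        components.append((J, ls))
        J += 1
    return components

for J, ls in multiplet(1, 1):
    print(J, ls)
```

For L = 1, S = 1 this gives three components, J = 0, 1, 2, in agreement with the count 2S + 1 = 2L + 1 = 3.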

Now we are in a position to explain Hund’s second rule with 
respect to the value of J corresponding to the least energy in the 
multiplet. 

Let there be a certain shell (nl)^p in which less than half the places 
are occupied (p is the number of electrons in the shell, p < 2l + 1). 
Then by Hund’s first rule, all the spins line up parallel to each 
other, and s = S/p. The constant a in (33.45) is positive. (This 
derives from (33.43), taking into account that dφ/dr < 0 and that 
the electron charge is negative.) But then it can be seen that the 
constant A in Eq. (33.46) is also positive and equal to a/p. Hence, 
from (33.47), to the least energy of the multiplet there corresponds 
the least total angular momentum J = |L − S|. 

If the shell is more than half-filled, it is more convenient to ex¬ 
amine the “holes” rather than the electrons. The total spin and orbi¬ 
tal angular momenta of a fully occupied shell are zero, so that there 
is no splitting. Each “hole” signifies the removal of one electron 
from this occupied state, that is, it has negative energy with respect 
to the zero level. If there are several “holes”, their spins are parallel. 
Therefore the constant A is now negative: it refers to “holes”. To the 
least energy in the multiplet there corresponds the largest total 
angular momentum J = L + S. Thus, Hund’s second rule is 
validated. When A > 0, a multiplet is said to be normal, when A < 0, 
it is inverted. The multiplet splitting itself is called the fine structure 
of the level with given n and l. 
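
The rule just derived can be put into a few lines of code. The sketch below is an illustration with a hypothetical function name; besides the rule for J it assumes the usual filling prescription for the ground term (largest S, and then the largest L compatible with it), which is stated here as an assumption and not derived.

```python
from fractions import Fraction

def hund_ground_level(l, p):
    """Return (L, S, J) for the ground level of the shell (nl)^p.

    Assumption: spins are first placed parallel in the states
    m = l, l - 1, ..., -l, and the remaining electrons are added with
    opposite spin in the same order of m.
    """
    half_capacity = 2 * l + 1
    if not 0 <= p <= 2 * half_capacity:
        raise ValueError("p must lie between 0 and 2(2l + 1)")
    m_values = list(range(l, -l - 1, -1))
    spin_up = m_values[:min(p, half_capacity)]
    spin_down = m_values[:max(p - half_capacity, 0)]
    S = Fraction(len(spin_up) - len(spin_down), 2)
    L = abs(sum(spin_up) + sum(spin_down))
    # Normal multiplet (A > 0) up to half filling, inverted (A < 0) beyond it.
    J = abs(L - S) if p <= half_capacity else L + S
    return L, S, J

print(hund_ground_level(1, 2))  # np^2
print(hund_ground_level(1, 4))  # np^4, two "holes"
```

The two calls give the same L and S but J = 0 for two p-electrons and J = 2 for two “holes”, which is the content of the normal and inverted multiplets.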

Levels with different L and S are spaced at distances of the order 
of one or several electron volts (see the example of nitrogen). This 
order of magnitude is explained, as we have seen, by the electrostatic 
interaction between electrons, which does not involve c in the deno¬ 
minator. Multiplet splitting is of a magnetic nature and is ac¬ 
cordingly smaller. Hence the name “fine structure”. 

The discourse above refers to light atoms: as is apparent from 
(33.44), in the case of heavy atoms multiplet splitting is of the 
same order as the electrostatic interaction between electrons. There¬ 
fore in heavy atoms another type of coupling than in light atoms is 
frequently observed: instead of the orbital and spin angular momenta 
of all the electrons combining into total L and S, the l's and s's of 
separate electrons combine to form their total angular momentum, j. 
After that the total angular momenta of individual electrons 
combine. This is known as j-j coupling. In nuclei, as we have seen, only 
j-j coupling occurs. 

The Atom in an External Magnetic Field (the Zeeman Effect). 

In examining the behaviour of a system of charges placed in an 
external magnetic field, a useful point of departure is the concept 
of Larmor’s magnetic moment precession about a field (see Sec. 17). 
In such precession only the angular momentum component along 
the field is conserved, the two perpendicular components averaged 
over the precession motion being zero. 

There is an analogous situation in quantum mechanics, with the 
only difference that angular momentum projections perpendicular 
to the field do not exist as physical entities. We thus have a simple 
correspondence between the integrals of the motion in classical and 
quantum mechanics. Such a corresponding quantity is the 
projection of the angular momentum on the magnetic field; it can be called 
a quantum integral of the motion. 

An external magnetic field imposed upon an atom disturbs its 
state in some specific way. Let us write the perturbing Hamiltonian 
for the case of normal coupling in the atom. Then the orbital mo¬ 
ments of the separate electrons combine into a total orbital angular 
momentum L, which in turn produces a magnetic moment: 

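$$\boldsymbol{\mu}_{\rm orb} = \frac{e\hbar}{2mc}\,\mathbf{L}$$

The spin magnetic moments also combine into a total spin 
moment: 

$$\boldsymbol{\mu}_{\rm sp} = \frac{e\hbar}{mc}\,\mathbf{S}$$

(Here, the 2 does not appear in the denominator owing to the 
magnetic spin anomaly: see Section 30.) 

The total magnetic moment of the atom is 

$$\boldsymbol{\mu} = \boldsymbol{\mu}_{\rm orb} + \boldsymbol{\mu}_{\rm sp} = \frac{e\hbar}{2mc}\,(\mathbf{L} + 2\mathbf{S}) \tag{33.48}$$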

It follows that magnetic moments combine according to a dif¬ 
ferent law than angular momenta, J = L + S. The magnetic mo¬ 
ment of an atom with nonzero L and S is not proportional to its 
angular momentum. 

The perturbing Hamiltonian due to the magnetic field is thus 
equal to 

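$$V = -(\boldsymbol{\mu}\cdot\mathbf{H}) = \beta\,\bigl(\mathbf{H}\cdot(\mathbf{L} + 2\mathbf{S})\bigr) = \beta\,\bigl(\mathbf{H}\cdot(\mathbf{J} + \mathbf{S})\bigr) \tag{33.49}$$

Here, β = |e|ħ/(2mc) is the Bohr magneton. The sign before 
it is due to the fact that the electron’s charge is negative. We recall 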
that in Eq. (17.33) the magnetic energy was defined precisely as 
a correction to the Hamiltonian, that is, to the energy expressed 
in terms of the momenta. That is why in quantum theory it is directly 
interpreted as an operator. 

In the subsequent reasoning it will be more useful to apply the 
so-called vector model of the atom, which makes the conclusions 
extremely graphic. The idea of this model is that the quantities 
J, L and S in Eq. (33.47) correspond to the three sides of a triangle. 
To each multiplet level corresponds its own triangle. In the limiting 
cases of J = L + S and J = |L − S|, the triangle degenerates 
into a straight line. 

Suppose, now, that a constant, homogeneous magnetic field is 
applied to the atom, and it acts on the atom’s magnetic moment 
according to Eq. (33.49). Two limiting cases are possible in which 
the vector model offers a very simple picture of the state of the atom 
due to the effect of the field. 

(i) Weak fields. Let the external fields be very small in comparison 
with the effective internal “field” in the atom which causes the whole 
triangle JLS to rotate about side J. More strictly speaking, the 
additional energy of the atom due to such a weak field is very small 
in comparison with the distance between the individual components 
of the multiplet (both definitions mean basically the same thing). 

In Section 32 it was shown that a perturbation may be considered 
weak if the displacement of levels due to it is small in comparison 
with the distances between the levels. In the present case this refers 
to the fine structure levels. Furthermore, the correction to the value 
of the energy level is obtained by averaging the perturbation energy 
over the undisturbed state. But we have said that quantum mechan¬ 
ical averaging over a state can be represented as averaging over 
precessional motion. This is the mode of expression in using the 
vector model. Of course, the model can offer no more than follows 
from the general propositions of quantum mechanics. In particular, 
the precession should not be imagined as a real rotation of the triangle; 
it is only an indication of how to perform the averaging correctly. 

The state of the atom with angular momentum J is (2J + 1)-fold 
degenerate, in accordance with the number of possible J_z 
projections. Let us now apply the general rule for finding corrections 
to the energy levels due to the perturbation. From Eq. (32.22), 
the correction to the level is calculated with the help of matrix 
elements of the perturbing energy taken between separate 
undisturbed degenerate states. In the special case, when there are only the 
matrix elements between the same degenerate states, the energy 
correction is simply equal to the diagonal matrix element, V_{nλ, nλ}. 
In this case the undisturbed state is degenerate in the angular 
momentum projection J_z, but the splitting components in a weak 
field correspond to the same J (J = n, J_z = λ). The first term 
in the perturbing Hamiltonian (33.49) is J_z itself (H is directed 
along z). Let us show that the second term (S_z) possesses only matrix 
elements diagonal in J_z, if we take them between states with the 
same values of the total angular momentum J. And since the 
magnetic field is by definition weak, it does not alter J. 

We write the following operator: 

$$S_z\,(J_x^2 + J_y^2 + J_z^2) = J_z\,(S_xJ_x + S_yJ_y + S_zJ_z) + (S_zJ_x - J_zS_x)\,J_x + (S_zJ_y - J_zS_y)\,J_y$$

Here, the right-hand side of the equation was obtained by means of 
an identical transformation. The product S_xJ_x + S_yJ_y + S_zJ_z = 
(S·J) is calculated in the same way as (L·S) from (33.47). It is 
diagonal for the given multiplet component; J_x² + J_y² + J_z² = J(J + 1) 
is also diagonal, whence 

$$S_z = \frac{J_z}{2J(J+1)}\,[J(J+1) - L(L+1) + S(S+1)] - \frac{1}{J(J+1)}\,(\gamma_yJ_x - \gamma_xJ_y)$$

where γ is determined as follows: 

$$\begin{aligned}
\gamma_x &= S_zJ_y - J_zS_y = S_zL_y - L_zS_y = L_yS_z - L_zS_y \\
\gamma_y &= J_zS_x - S_zJ_x = L_zS_x - S_zL_x = L_zS_x - L_xS_z \\
\gamma_z &= L_xS_y - L_yS_x
\end{aligned} \tag{33.50}$$

It will be shown in Exercise 4 that γ has no matrix elements 
diagonal in J. Consequently, its mean value over the undisturbed state 
of the system, that is, over one multiplet component corresponding 
to a definite J, is zero. Thus, S_z is proportional to J_z, at least in 
the required approximation. From this we immediately obtain the 
expression for the energy splitting in a magnetic field: 

$$E = \beta\,|\mathbf{H}|\,J_z\left[1 + \frac{J(J+1) - L(L+1) + S(S+1)}{2J(J+1)}\right] \tag{33.51}$$

The expression in the brackets is known as the Landé factor. 
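
The Landé factor is simple to evaluate for particular levels. A minimal sketch with exact fractions follows (the function name `lande_factor` is hypothetical; the case J = 0, where the level is not split, is excluded).

```python
from fractions import Fraction

def lande_factor(J, L, S):
    """Bracket of Eq. (33.51): 1 + [J(J+1) - L(L+1) + S(S+1)] / [2J(J+1)]."""
    J, L, S = Fraction(J), Fraction(L), Fraction(S)
    if J == 0:
        raise ValueError("a level with J = 0 has no weak-field splitting")
    return 1 + (J * (J + 1) - L * (L + 1) + S * (S + 1)) / (2 * J * (J + 1))

print(lande_factor("3/2", 1, "1/2"))   # doublet P level with J = 3/2
print(lande_factor("1/2", 1, "1/2"))   # doublet P level with J = 1/2
print(lande_factor(2, 2, 0))           # pure orbital moment
print(lande_factor("1/2", 0, "1/2"))   # pure spin moment
```

The last two calls show the limiting values: 1 for a purely orbital angular momentum and 2 for a pure spin, in accordance with the magnetic spin anomaly.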

Let us now show how Eq. (33.51) is developed on the basis of 
the vector model. Figure 40 presents a triangle which in the absence 
of a magnetic field rotates very rapidly around side J. In an external 
magnetic field H, which is sufficiently weak, side J itself is in a sub¬ 
stantially slower precessional rotation about H, remaining at a con¬ 
stant angle to it, since we assume J z to be a constant of the motion; 
actually this is valid only insofar as J is a constant of the motion, 
as we have seen from the strict proof. 

Owing to the rapid precession of the triangle about J, only the 
projection of S on J, equal to 

$$\mathbf{S}_J = \mathbf{J}\,\frac{(\mathbf{S}\cdot\mathbf{J})}{\mathbf{J}^2}$$

remains approximately constant. The term (S·J) in the numerator 
of the expression above is found from the equation 

$$\mathbf{L}^2 = (\mathbf{J} - \mathbf{S})^2 = \mathbf{J}^2 + \mathbf{S}^2 - 2\,(\mathbf{J}\cdot\mathbf{S})$$

whence 

$$(\mathbf{J}\cdot\mathbf{S}) = \frac{J(J+1) + S(S+1) - L(L+1)}{2}$$

Substituting S_J for S in Eq. (33.49), we again arrive at Eq. (33.51). 
Thus we obtain a graphic interpretation of the Landé factor. 

(ii) Strong fields. Suppose the magnetic field is very strong, so 
that the perturbing energy is much larger than the distances between 
the multiplet components. In terms of the vector model this means 
that the orbital and spin angular momenta are in much faster 
precessional motion about H than about the third side J of the triangle. 
But as a consequence of magnetic anomaly, the precession of the 
spin angular momentum is twice as fast, therefore the coupling 
in the triangle breaks down. This means that in (33.49) it is more 
convenient to use the first form of notation, in terms of L and S, 
rather than J, and find the eigenvalues L_z and S_z separately. 

The magnetic field breaks the coupling between L and S, but 
it is still too weak to lead to transitions between different values 
of L and S , to which correspond, as we have seen, energy intervals 
of several electron volts. That is why the eigenvalues of the opera¬ 
tor (33.49) are simply obtained if we go through all the different 
projections L z and S z : 

$$E = \beta\,|\mathbf{H}|\,(L_z + 2S_z) \tag{33.52}$$
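
The strong-field levels (33.52) can be listed by running through all projections. In the sketch below (a hypothetical helper, with the spin passed as the integer 2S to avoid fractions) the energies are expressed in units of β|H|.

```python
from collections import Counter

def strong_field_levels(L, two_S):
    """Map each value of L_z + 2 S_z to the number of states having it."""
    levels = Counter()
    for l_z in range(-L, L + 1):
        # 2 S_z runs over -2S, -2S + 2, ..., 2S.
        for two_s_z in range(-two_S, two_S + 1, 2):
            levels[l_z + two_s_z] += 1
    return dict(sorted(levels.items()))

print(strong_field_levels(1, 1))   # L = 1, S = 1/2
```

For L = 1, S = 1/2 the six states fall on five equidistant levels, the central one being doubly degenerate.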

The two types of splitting of a multiplet level in 
a magnetic field manifest themselves in very different ways in atomic 
spectra, which will be examined in Section 36. 

The Atom in a Constant, Homogeneous Electric Field (the Stark 
Effect). We shall now consider the behaviour of a multiplet level 
in an external electric field, starting with the case of a weak field, 
when the shift in the levels due to the field is small in comparison 
with the natural splitting of the multiplet. 

First of all, it should be borne in mind that the angular momentum 
projection on the electric field is determined only to within sign, 
because the angular momentum is a pseudovector, while the electric 
field is a true vector. In a reversal of the signs of all the coordinates, 
the signs of the angular momentum projections do not change, 
whereas those of the electric field projections do. But since the choice 
of a right-handed or left-handed coordinate system is arbitrary, 
the angular momentum projections on the electric field are physi¬ 
cally determined only to within sign. 

If J is an integer, there exist J + 1 angular momentum 
projections on the electric field (0, 1, . . ., J); if J is half-integral, the 
total number of projections is J + 1/2 (1/2, 3/2, . . ., J). The 
state with J = 1/2 is not split by the electric field at all. Thus, 
splitting in a magnetic field is more complete. 
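
The counting of levels in a weak electric field amounts to listing the distinct absolute values of the projection. A short sketch (the function name `stark_projections` is hypothetical):

```python
from fractions import Fraction

def stark_projections(J):
    """Distinct values of |J_z| for a level with total angular momentum J."""
    J = Fraction(J)
    if J < 0 or (2 * J).denominator != 1:
        raise ValueError("J must be a non-negative integer or half-integer")
    values = set()
    m = J
    while m >= -J:
        values.add(abs(m))   # projections +m and -m are not distinguished
        m -= 1
    return sorted(values)

for J in (2, "3/2", "1/2"):
    print(J, len(stark_projections(J)), 2 * Fraction(J) + 1)
```

The second number printed is the number of levels in an electric field, the third the number 2J + 1 of levels in a magnetic field.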

In a strong electric field the coupling between L and S breaks 
down. Then the splitting pattern is as follows. The quantum number L 
is an integer, so that the vector L has L + 1 projections on the 
electric field. Since it is coupled 
more strongly with the field than the spin vector, whose coupling 
with the electric field is of the same type as that of spin-orbit 
coupling in a multiplet, the splitting of a level in a strong electric field 
is primarily determined by the absolute value of the projection 
on the field, L_z. For a given value of the L_z projection the spin 
projection with respect to L_z already has 2S + 1 values, since L and S 
are both pseudovectors. We obtain L + 1 separate groups of 2S + 1 
levels each. 

The only exception is the group in which the orbital angular 
momentum projection on the field, L_z, is zero. In it spin splitting takes 
place into S + 1 or S + 1/2 levels, depending on whether S is an 
integer or half-integer. 

The magnitude of the splitting is determined by the relative
shift of neighbouring levels. As was shown in Section 31, the level
shift equals the mean value of the perturbing energy with respect
to the unperturbed motion. From (16.28), we have an expression
for the perturbing Hamiltonian in an electric field in the form

$$\hat{H}_{el} = -(\mathbf{d} \cdot \mathbf{E}) \tag{33.53}$$

It can easily be shown that the mean value of this quantity is
zero. Indeed, the wave function of an atomic state with a given J
is always either odd or even (for the case of the hydrogen atom see
below). Therefore the product $\psi^*\psi$ is necessarily an even function
of the coordinates. But then the mean value of the operator (33.53)
involves an odd function of the coordinates in the integrand, the
dipole moment (cf. (25.19)), so that the whole integral identically
becomes zero.

The splitting of levels is obtained only in the second approxima¬ 
tion and is therefore quadratic with respect to the external field. 
But, as will be shown, this refers to a field which does not disrupt 
the coupling between L and S (weak fields). 

In a hydrogen atom the energy of the electron is determined only 
by the principal quantum number n and does not depend upon l.





Therefore the state with E = E n is seen as a superposition of states 
with different l from 0 to n — 1. But for l even, the wave function 
is even, and for l odd it is odd. Consequently, the function with 
E = E n has no definite parity, so that the mean value of the dipole 
moment does not become zero. That is why in a hydrogen atom the 
splitting of lines depends upon the electric field linearly. 

Note that n and j are involved in the relativistic formula for the
energy of a hydrogen atom. For a given j the orbital moment
l = j ± 1/2, and as in the nonrelativistic approximation the state
with given n and j has no definite parity, which makes for the linear
splitting effect.

Highly excited atomic states always more or less resemble the 
state of the hydrogen atom, because the nucleus and the atomic core 
act on an electron far from the nucleus like a point charge. The energy 
of these states depends upon l according to Eq. (29.51). If the
disturbance caused by the field is stronger than the dependence of the
energy level upon l, then a linear splitting effect is observed.
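
The contrast between the quadratic and the linear effect can be sketched with a two-level model. This is a minimal illustration, not the book's calculation: two states of opposite parity with unperturbed energies `e1` and `e2` are mixed by a coupling equal to the dipole matrix element times the field, and all names and numerical values are hypothetical.

```python
import math


def two_level_energies(e1, e2, coupling):
    """Eigenvalues of the symmetric matrix [[e1, coupling], [coupling, e2]]."""
    mean = 0.5 * (e1 + e2)
    half_gap = 0.5 * (e1 - e2)
    root = math.hypot(half_gap, coupling)
    return mean - root, mean + root


# Degenerate levels (hydrogen-like case): the shift is linear in the coupling.
print(two_level_energies(0.0, 0.0, 0.003))

# Well-separated levels: the shift is close to coupling**2 / gap, i.e. quadratic.
low, high = two_level_energies(1.0, 0.0, 0.01)
print(high - 1.0, 0.01 ** 2 / 1.0)
```

When the coupling becomes comparable with the gap, the exact expression passes smoothly from the quadratic to the linear regime, which is the situation described for highly excited states.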

A constant electric field not only displaces the energy levels of
the atom but qualitatively alters its whole state as well. Let us
write the potential energy of an electron in an atom subject to the
action of an external electric field E directed along the z axis:

$$U = U_0(r) + e\,|E|\,z \tag{33.54}$$


For a sufficiently large and negative z, the potential energy far 
from the atom is less than in the atom, but this domain of values 
of z is separated from the region of motion of the electron in the 
atom by a potential barrier. There always exists a nonzero probabil¬ 
ity of a spontaneous tunneling of the electron through the poten¬ 
tial barrier into a free state. Such phenomena were examined in 
Section 31. 

Whatever the state of an atom, in an external electric field it is 
capable of spontaneous ionization by an electron’s penetration of 
the potential barrier, just as a nucleus disintegrates spontaneously 
with the emission of an alpha-particle. Of course, if the field is weak 
the decay probability is negligibly small. The barrier becomes 
penetrable in a strong field, especially for highly excited atomic 
states. If the time of spontaneous electron emission in such a state 
becomes less than the time of quantum emission, the lines in the 
spectrum corresponding to transitions from this state disappear. 

Thus, a perturbation that is weak inside the atom (the atomic
unit of field strength $|E| = m^2e^5/h^4 = 5.13 \times 10^9$ V·cm$^{-1}$, so that
an external field is always weak in comparison with the atomic
field) nevertheless significantly affects the state, because the
conditions at infinity are changed. But if the broadening of atomic levels
due to their finite lifetime with respect to spontaneous ionization
is small in comparison with the distance between the levels, the
levels may continue to be treated as discrete.


EXERCISES 


1. Find the possible states of a system of two d-electrons with the same
principal quantum numbers.

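Solution. Each electron can occur in ten states, labelled by the
projections of its orbital angular momentum and of its spin:

A: 2, 1/2;   B: 1, 1/2;   C: 0, 1/2;   D: -1, 1/2;   E: -2, 1/2

F: 2, -1/2;  G: 1, -1/2;  H: 0, -1/2;  I: -1, -1/2;  J: -2, -1/2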

The states with positive projections of spin and orbital angular momen¬ 
tum are 

AB: 3, 1; AC: 2, 1; AD: 1, 1; AE: 0, 1

AF: 4, 0; AG: 3, 0; AH: 2, 0; AI: 1, 0

AJ: 0, 0; BC: 1, 1; BD: 0, 1; BF: 3, 0

BG: 2, 0; BH: 1, 0; BI: 0, 0; CF: 2, 0

CG: 1, 0; CH: 0, 0; DF: 1, 0; DG: 0, 0; EF: 0, 0

Choosing the states with maximum angular momentum projections,
we obtain three resultant states with zero spin:

$^1S$, $^1D$, $^1G$, or $^1S_0$, $^1D_2$, $^1G_4$

and two states with unity spin:

$^3P$, $^3F$, or $^3P_0$, $^3P_1$, $^3P_2$, and $^3F_2$, $^3F_3$, $^3F_4$

2. Show that in a system of four p-electrons with the same principal
quantum numbers the states are the same as in a system of two p-electrons;
in other words, that two electrons have the same states as two “holes”.
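
The counting in Exercises 1 and 2 can be checked by brute force. The sketch below is only an illustration: it enumerates all Pauli-allowed occupations of the one-electron states, tabulates the total projections, and then peels off terms starting from the largest projection. The function name `ls_terms` and the term-letter string are hypothetical choices.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

LETTERS = "SPDFGHIK"  # spectroscopic letters for L = 0, 1, 2, ...


def ls_terms(l, n_electrons):
    """LS terms of n equivalent electrons with orbital quantum number l."""
    half = Fraction(1, 2)
    one_electron = [(ml, ms) for ml in range(-l, l + 1) for ms in (half, -half)]
    # Pauli principle: each one-electron state is occupied at most once.
    table = Counter()
    for combo in combinations(one_electron, n_electrons):
        total_ml = sum(ml for ml, _ in combo)
        total_ms = sum(ms for _, ms in combo)
        table[(total_ml, total_ms)] += 1
    terms = []
    while any(table.values()):
        # Largest remaining M_L, and the largest M_S for it, identify one term.
        L, S = max(key for key, count in table.items() if count > 0)
        terms.append(f"{int(2 * S + 1)}{LETTERS[L]}")
        # Remove all (2L + 1)(2S + 1) projections belonging to that term.
        for ML in range(-L, L + 1):
            for k in range(int(2 * S) + 1):
                table[(ML, -S + k)] -= 1
    return terms


print(ls_terms(2, 2))                   # two d-electrons
print(ls_terms(1, 2), ls_terms(1, 4))   # two p-electrons and two p-"holes"
```

The first call returns the singlets and triplets listed in the solution of Exercise 1; the second line compares two p-electrons with four p-electrons.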

3. Calculate the total energy of the electrons in an atom according 
to the Thomas-Fermi method. 

Solution. From (28.25), (33.26), and (33.28), the total kinetic energy
of all the electrons in an atom is

$$E_{kin} = \frac{(2m)^{3/2}}{5\pi^2 h^3} \int_0^\infty \varepsilon_0^{5/2} \times 4\pi r^2\,dr = \frac{(2m)^{3/2}}{5\pi^2 h^3} \times 4\pi \int_0^\infty (e\varphi)^{5/2}\, r^2\,dr$$

because the boundary kinetic energy of the electrons is $e\varphi$. Instead of $e\varphi$
we substitute $Ze^2\psi/r$ and pass to the dimensionless variable x in accordance
with (33.33). Then for $E_{kin}$ we obtain


00 



The potential energy falls into two parts: the potential energy of the
electrons’ interaction with the nucleus

$$E^{(1)}_{pot} = -\int_0^\infty \frac{Ze^2}{r}\, n \times 4\pi r^2\,dr$$

and the interaction energy of the electrons among themselves:

$$E^{(2)}_{pot} = \frac{1}{2} \int_0^\infty \frac{Ze^2}{r}\,(1 - \psi)\, n \times 4\pi r^2\,dr$$

The factor 1/2 takes account of the fact that each electron should be counted
once. Adding both parts of the potential energy and passing to the
dimensionless variable, we obtain

i x 

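The integrals involved in this expression for the energy are easily
calculated with the help of Eq. (33.34), namely,

$$\int_0^\infty \frac{\psi^{3/2}}{x^{1/2}}\,dx = \int_0^\infty \psi''\,dx = -\psi'(0)$$

because $\psi'(\infty) = 0$. We transform the second integral by parts to get

$$\begin{aligned}
\int_0^\infty \frac{\psi^{5/2}}{x^{1/2}}\,dx &= 2x^{1/2}\psi^{5/2}\Big|_0^\infty - 5\int_0^\infty x^{1/2}\psi^{3/2}\psi'\,dx \\
&= -5\int_0^\infty \psi'\psi''\,x\,dx = -\frac{5}{2}\int_0^\infty x\,\frac{d}{dx}(\psi')^2\,dx \\
&= -\frac{5}{2}\,x(\psi')^2\Big|_0^\infty + \frac{5}{2}\int_0^\infty (\psi')^2\,dx = \frac{5}{2}\int_0^\infty (\psi')^2\,dx
\end{aligned}$$

since the integrated expressions are equal to zero. Further

$$\int_0^\infty (\psi')^2\,dx = \psi\psi'\Big|_0^\infty - \int_0^\infty \psi\psi''\,dx = -\psi'(0) - \int_0^\infty \frac{\psi^{5/2}}{x^{1/2}}\,dx$$

since $\psi(0) = 1$. Hence

$$\int_0^\infty \frac{\psi^{5/2}}{x^{1/2}}\,dx = -\frac{5}{7}\,\psi'(0)$$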

Substituting these expressions for the integrals into $E_{kin}$ and $E_{pot}$, we
note that $E_{pot} = -2E_{kin}$, so that the total energy is equal to $-E_{kin}$, in
agreement with the result of the exact theory set forth in Exercise 1,
Section 32. The quantity $\psi'(0)$ is equal to -1.589. From this we obtain the
following formula for the total coupling energy of all the electrons of an
atom:

$$E = -0.769\,\frac{me^4}{h^2}\,Z^{7/3} = -20.94\,Z^{7/3}\ \text{eV}$$

For example, for uranium $E = -8 \times 10^5$ eV, or $-1.6\,mc^2$.
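
The uranium estimate can be reproduced with a few lines. This is a sketch only: the coefficient 0.769 is taken from the text, while the values of the atomic energy unit (about 27.21 eV) and of the electron rest energy (about 0.511 MeV) are assumptions supplied here, and the function name is hypothetical.

```python
ATOMIC_ENERGY_UNIT_EV = 27.2114   # m e^4 / h^2 in eV (assumed value)
ELECTRON_REST_ENERGY_EV = 510998.95  # m c^2 in eV (assumed value)


def thomas_fermi_energy_ev(Z, coefficient=0.769):
    """Total Thomas-Fermi energy of the electrons of an atom with charge Z."""
    return -coefficient * ATOMIC_ENERGY_UNIT_EV * Z ** (7.0 / 3.0)


energy_u = thomas_fermi_energy_ev(92)
print(energy_u)                             # of the order of -8e5 eV
print(energy_u / ELECTRON_REST_ENERGY_EV)   # in units of m c^2
```

Doubling Z multiplies the result by 2^(7/3), roughly a factor of five, which makes the steep growth of the binding energy with nuclear charge easy to see.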

The dependence $Z^{7/3}$ is easily obtainable without computations in the
following way. The Coulomb forces decrease slowly with distance. Therefore
all the electrons interact in pairs. There are $Z^2$ pairs. The mean distance
between the electrons decreases as $Z^{-1/3}$ (see (33.33)). This yields $Z^{7/3}$. Note
that for nuclei the coupling energy is proportional to the first power of the 
number of particles (within broad limits). This is an indication of the small 
distances at which nuclear forces act: each nucleon, that is, proton or neu¬ 
tron, interacts not with all the other nucleons but only with its immediate 
neighbours. 

4. Determine the nonzero matrix elements of $\gamma$ defined by Eqs. (33.50)
with respect to J and $J_z$, that is $(\gamma)_{JJ_z;\,J'J'_z}$.

Solution. We find the commutation relations for the components of
the total angular momentum and $\gamma$. For example

$$\begin{aligned}
J_x\gamma_y - \gamma_yJ_x &= J_x(L_zS_x - L_xS_z) - (L_zS_x - L_xS_z)J_x \\
&= (L_x + S_x)(L_zS_x - L_xS_z) - (L_zS_x - L_xS_z)(L_x + S_x) \\
&= -iL_yS_x + iL_xS_y = i\gamma_z
\end{aligned}$$

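In the same way we obtain the following commutation relations:

$$J_y\gamma_x - \gamma_xJ_y = -i\gamma_z, \qquad J_z\gamma_x - \gamma_xJ_z = i\gamma_y$$

$$J_x\gamma_z - \gamma_zJ_x = -i\gamma_y, \qquad J_y\gamma_z - \gamma_zJ_y = i\gamma_x$$

$$J_z\gamma_y - \gamma_yJ_z = -i\gamma_x$$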

The like components of vectors J and $\gamma$ commute. It can easily be verified
that these commutation relations are the same as the commutation relations
for the orbital angular momentum components of a particle, M, with
Cartesian coordinates.

Since $J_z$ commutes with $\gamma_z$, they are diagonal in the same
representation; hence, only the matrix elements $(\gamma_z)_{J_zJ'_z}$ with the same $J_z$ are nonzero.
Further, we take the commutation relations for $J_z$ and the components $\gamma_x$
and $\gamma_y$. We multiply the second by ±i and add with the first, obtaining

$$J_z(\gamma_x \pm i\gamma_y) - (\gamma_x \pm i\gamma_y)J_z = \pm(\gamma_x \pm i\gamma_y)$$

We denote $\gamma_\pm = (\gamma_x \pm i\gamma_y)$ and take the matrix element of the
commutation relation $J_z\gamma_\pm - \gamma_\pm J_z = \pm\gamma_\pm$. Since $J_z$ is diagonal, we have

$$J_z(\gamma_\pm)_{J_zJ'_z} - (\gamma_\pm)_{J_zJ'_z}J'_z = \pm(\gamma_\pm)_{J_zJ'_z}$$

Transferring all the terms to the left-hand side of the equation, we find

$$(J_z - J'_z \mp 1)(\gamma_\pm)_{J_zJ'_z} = 0$$

We see from this that the matrix element $(\gamma_\pm)_{J_zJ'_z}$ is not zero only for
$J_z = J'_z \pm 1$.

The calculations presented here show how to find nonzero matrix ele¬ 
ments. A commutator should be developed between an operator which is 
regarded as diagonal in the given representation and the operator whose 
matrix elements are being investigated. Then this matrix element can be 
taken outside the parentheses. For the matrix element itself not to vanish 
the expression in parentheses should be zero. 

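For this we must sometimes perform the commutation twice. Let us
examine this case. We develop the commutator

$$\begin{aligned}
[J^2, \gamma_z] &\equiv (J_x^2 + J_y^2 + J_z^2)\gamma_z - \gamma_z(J_x^2 + J_y^2 + J_z^2) \\
&= (J_x^2 + J_y^2)\gamma_z - \gamma_z(J_x^2 + J_y^2) \\
&= J_x^2\gamma_z - J_x\gamma_zJ_x + J_x\gamma_zJ_x - \gamma_zJ_x^2 \\
&\quad + J_y^2\gamma_z - J_y\gamma_zJ_y + J_y\gamma_zJ_y - \gamma_zJ_y^2 \\
&= J_x(J_x\gamma_z - \gamma_zJ_x) + (J_x\gamma_z - \gamma_zJ_x)J_x \\
&\quad + J_y(J_y\gamma_z - \gamma_zJ_y) + (J_y\gamma_z - \gamma_zJ_y)J_y \\
&= -iJ_x\gamma_y - i\gamma_yJ_x + iJ_y\gamma_x + i\gamma_xJ_y \\
&= 2iJ_y\gamma_x - \gamma_z - 2iJ_x\gamma_y - \gamma_z \\
&= 2i(J_y\gamma_x - J_x\gamma_y) - 2\gamma_z
\end{aligned}$$

In the next-to-last step the products were reordered with the help of
$\gamma_yJ_x = J_x\gamma_y - i\gamma_z$ and $\gamma_xJ_y = J_y\gamma_x + i\gamma_z$.
By analogy, we write two more commutators:

$$[J^2, \gamma_x] = 2i(J_z\gamma_y - J_y\gamma_z) - 2\gamma_x$$

$$[J^2, \gamma_y] = 2i(J_x\gamma_z - J_z\gamma_x) - 2\gamma_y$$

We now find the second commutator:

$$\begin{aligned}
[J^2, [J^2, \gamma_z]] &= 2i(J_y[J^2, \gamma_x] - J_x[J^2, \gamma_y]) - 2[J^2, \gamma_z] \\
&= 4J_x^2\gamma_z - 4J_yJ_z\gamma_y - 4J_xJ_z\gamma_x + 4J_y^2\gamma_z \\
&\quad - 4i(J_y\gamma_x - J_x\gamma_y) - 2[J^2, \gamma_z]
\end{aligned}$$

Here, we should take advantage of the fact that all the components of J
commute with $J^2$. Here we substituted the expressions for the first
commutators $[J^2, \gamma_x]$ and $[J^2, \gamma_y]$ into the right-hand side. We make use of the
commutation relations for the angular momentum components; then the
second commutator reduces to

$$\begin{aligned}
[J^2, [J^2, \gamma_z]] &= 4J_x^2\gamma_z - 4(J_zJ_y + iJ_x)\gamma_y - 4(J_zJ_x - iJ_y)\gamma_x + 4J_y^2\gamma_z \\
&\quad - 4i(J_y\gamma_x - J_x\gamma_y) - 2[J^2, \gamma_z] \\
&= 4J^2\gamma_z - 4J_z(\mathbf{J} \cdot \boldsymbol{\gamma}) - 2[J^2, \gamma_z] \\
&= 4J^2\gamma_z - 2(J^2\gamma_z - \gamma_zJ^2) - 4J_z(\mathbf{J} \cdot \boldsymbol{\gamma})
\end{aligned}$$

But $(\mathbf{J} \cdot \boldsymbol{\gamma})$ is identically zero, so that

$$[J^2, [J^2, \gamma_z]] = 2J^2\gamma_z + 2\gamma_zJ^2$$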
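
The operator identities used here can be checked numerically on a small example. The sketch below is an illustration under stated assumptions: it takes $\gamma$ to be the vector product of L and S (as implied by the component $\gamma_y = L_zS_x - L_xS_z$ in the solution), picks l = 1 and spin 1/2 in units of h, and uses plain Python lists as matrices. All helper names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def lincomb(ca, a, cb, b):
    """Return ca*a + cb*b for two matrices of equal shape."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def kron(a, b):
    """Kronecker product, used to embed L and S in the product space."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def commutator(a, b):
    return lincomb(1, matmul(a, b), -1, matmul(b, a))


def max_abs(a):
    return max(abs(x) for row in a for x in row)


r = 1 / math.sqrt(2)
l_ops = ([[0, r, 0], [r, 0, r], [0, r, 0]],
         [[0, -1j * r, 0], [1j * r, 0, -1j * r], [0, 1j * r, 0]],
         [[1, 0, 0], [0, 0, 0], [0, 0, -1]])
s_ops = ([[0, 0.5], [0.5, 0]], [[0, -0.5j], [0.5j, 0]], [[0.5, 0], [0, -0.5]])

L = [kron(m, identity(2)) for m in l_ops]
S = [kron(identity(3), m) for m in s_ops]
J = [lincomb(1, a, 1, b) for a, b in zip(L, S)]
# gamma = L x S, written out component by component
gamma = [lincomb(1, matmul(L[(i + 1) % 3], S[(i + 2) % 3]),
                 -1, matmul(L[(i + 2) % 3], S[(i + 1) % 3])) for i in range(3)]
J2 = lincomb(1, lincomb(1, matmul(J[0], J[0]), 1, matmul(J[1], J[1])),
             1, matmul(J[2], J[2]))


def residuals():
    """Largest deviations from the three relations quoted in the text."""
    first = lincomb(1, commutator(J[0], gamma[1]), -1j, gamma[2])
    j_dot_gamma = lincomb(1, lincomb(1, matmul(J[0], gamma[0]),
                                     1, matmul(J[1], gamma[1])),
                          1, matmul(J[2], gamma[2]))
    double = commutator(J2, commutator(J2, gamma[2]))
    rhs = lincomb(2, matmul(J2, gamma[2]), 2, matmul(gamma[2], J2))
    return max_abs(first), max_abs(j_dot_gamma), max_abs(lincomb(1, double, -1, rhs))


print(residuals())
```

All three residuals should vanish to rounding accuracy: the first checks $J_x\gamma_y - \gamma_yJ_x = i\gamma_z$, the second that $(\mathbf{J} \cdot \boldsymbol{\gamma})$ is zero, and the third the final double-commutator relation.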

Thus we have obtained the relationship between the operators containing
only the diagonal operator $J^2$ and $\gamma_z$:

$$(J^2)^2\gamma_z - 2J^2\gamma_zJ^2 + \gamma_z(J^2)^2 - 2J^2\gamma_z - 2\gamma_zJ^2 = 0$$

From the obtained equation we now form the matrix element with
indices J and J'. In this, instead of $J^2$ and $J'^2$ we must write J (J + 1)
and J' (J' + 1):

$$\begin{aligned}
&J^2(J + 1)^2(\gamma_z)_{JJ'} - 2J(J + 1)(\gamma_z)_{JJ'}J'(J' + 1) \\
&\quad + (\gamma_z)_{JJ'}J'^2(J' + 1)^2 - 2J(J + 1)(\gamma_z)_{JJ'} - 2(\gamma_z)_{JJ'}J'(J' + 1) = 0
\end{aligned}$$

We transform the factor of $(\gamma_z)_{JJ'}$ thus:

$$\begin{aligned}
&J^2(J + 1)^2 - 2J(J + 1)J'(J' + 1) + J'^2(J' + 1)^2 - 2J(J + 1) - 2J'(J' + 1) \\
&= J^4 + 2J^3 + J^2 - 2JJ'(JJ' + J + J' + 1) + J'^4 + 2J'^3 + J'^2 - 2J^2 - 2J - 2J'^2 - 2J' \\
&= (J + J')[(J - J')^2(J + J') + 2(J^2 - JJ' + J'^2) - 2JJ' - (J + J') - 2] \\
&= (J + J')(J + J' + 2)(J - J' - 1)(J - J' + 1)
\end{aligned}$$

Thus we arrive at the equation

$$(J + J' + 2)(J + J')(J - J' - 1)(J - J' + 1) = 0$$

The first factor cannot vanish. The second is equal to zero only if
J = J' = 0. This can never happen in a multiplet. But furthermore, the
conditions for $(\gamma_\pm)_{J_zJ'_z}$ preclude the possibility of J = J' = 0. It follows
that the only nonzero matrix element $(\gamma_z)_{JJ'}$ is obtained for J' = J ± 1.
But such a matrix element refers to two different components of the
multiplet, so that in averaging over one component it makes no contribution
to $\langle S_z \rangle$.

We shall make use of the rules for $(\gamma_z)_{JJ'}$ in Section 36.


34


DIATOMIC MOLECULES 

Homopolar Bonding. Chemical bonding is, in the final analysis, 
always due to electrostatic interaction between the nuclei and elec¬ 
trons of two atoms. However, for a comprehensive qualitative as 
well as quantitative explanation of it, quantum mechanics is es¬ 
sential. Bohr’s orbits, which were once used to explain the stabil¬ 
ity of atoms, lack elementary mechanical stability when applied 
to molecules, that is, they cannot bind atoms together. 

There are two different types of chemical bonds: those in which 
the electric charge passes in part or in whole from one atom to an¬ 
other, and those in which the atoms remain strictly neutral. The 
former is known as heteropolar, or ionic, bonding, the latter, as
homopolar, or covalent, bonding.

In Section 32 it was stated that the electron shell of an atom of 
fluorine is one electron short of the fully occupied shell of neon. 
When fluorine combines with hydrogen, the electron of the hydrogen 
atom transfers to the fluorine atom, which becomes negatively 
charged and attracts the hydrogen nucleus (proton). The combina¬ 
tion of these two atoms is sufficiently stable, and energy is evolved 
in the process. Quantum mechanics, of course, is required for a 
comprehensive calculation of such a system, the same as for all 
atomic calculations, but from the qualitative point of view, at least, 
it is clear why a negatively charged particle combines with a posi¬ 
tively charged one. 

When two hydrogen atoms combine into a hydrogen molecule
no charge transfer occurs (experimental data reveal that a hydrogen
molecule has no dipole moment). Therefore the simple model of two
attracting opposite charges is at the very least inadequate to explain
the nature of homopolar bonding. The quantum mechanical
explanation of the mechanism of such bonding was offered by W. Heitler
and F. London in 1927.

In the zero (initial) approximation of the Heitler-London method 
the atoms are considered to be independent. Each electron is
associated with its nucleus. We denote the nuclei by the letters a and b,
and the electrons by the subscripts 1 and 2. The interaction between
the atoms is not taken into account in the initial approximation.
The wave functions of the electrons are, respectively, $\psi_a(\mathbf{r}_1)$ and
$\psi_b(\mathbf{r}_2)$. But it is apparent that the state of the system is degenerate,
because a system of the same energy results if we take the electron
functions $\psi_a(\mathbf{r}_2)$ and $\psi_b(\mathbf{r}_1)$. The wave function of the zero
approximation for a degenerate state is developed according to the general
rules of perturbation theory (see (32.20)-(32.23)) in such a way as to
diagonalize the perturbing Hamiltonian.

Like any Hamiltonian of a two-electron system, a Hamiltonian 
describing atomic interactions is symmetric with respect to an inter¬ 
change of the two electrons. Hence, if we take 

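$$\Psi_S = \psi_a(\mathbf{r}_1)\,\psi_b(\mathbf{r}_2) + \psi_a(\mathbf{r}_2)\,\psi_b(\mathbf{r}_1) \tag{34.1}$$

and

$$\Psi_A = \psi_a(\mathbf{r}_1)\,\psi_b(\mathbf{r}_2) - \psi_a(\mathbf{r}_2)\,\psi_b(\mathbf{r}_1) \tag{34.2}$$

as the zero-approximation functions and, as always, denote the
perturbing Hamiltonian $\hat{V}_{12}$, then only the matrix elements

$$(V_{12})_{SS} = \int \Psi_S^*\,\hat{V}_{12}\,\Psi_S\,d\tau_1\,d\tau_2 \tag{34.3}$$

$$(V_{12})_{AA} = \int \Psi_A^*\,\hat{V}_{12}\,\Psi_A\,d\tau_1\,d\tau_2 \tag{34.4}$$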

do not vanish. 

Both wave functions (34.1) and (34.2) must be made to satisfy
Pauli’s exclusion principle. For that $\Psi_S$ should be multiplied by an
antisymmetric spin function, and $\Psi_A$ by a symmetric spin function.
Then the $\Psi_S$ state will correspond to total zero spin, and the $\Psi_A$
state to total spin unity (Sec. 33).

Furthermore, both functions $\Psi_S$ and $\Psi_A$ must be normalized,
that is, divided by

$$N_S = \left( \int |\Psi_S|^2\,d\tau_1\,d\tau_2 \right)^{1/2} \quad \text{and} \quad N_A = \left( \int |\Psi_A|^2\,d\tau_1\,d\tau_2 \right)^{1/2}$$

In future we may write a scalar argument instead of the vector 
argument r in the wave functions, since it is assumed that both 
hydrogen atoms are in the ground state, with spherical symmetry. 
Accordingly, the arguments involve four distances, $r_{a1}$, $r_{b2}$, $r_{a2}$,
and $r_{b1}$, while $\Psi_S$ and $\Psi_A$ should be written as

$$\Psi_S = (N_S)^{-1}\,[\psi(r_{a1})\,\psi(r_{b2}) + \psi(r_{b1})\,\psi(r_{a2})] \tag{34.5}$$

$$\Psi_A = (N_A)^{-1}\,[\psi(r_{a1})\,\psi(r_{b2}) - \psi(r_{b1})\,\psi(r_{a2})] \tag{34.6}$$

Here, $\psi$ is one and the same function, that is, the function of the
ground state of the hydrogen atom. From (29.23), it is equal to $e^{-\xi}$,
where $\xi$ is the distance from the nucleus in atomic units. There is
no need to normalize it, since the functions $\Psi_A$ and $\Psi_S$ are
subsequently normalized anyway.

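We now write the Schrodinger equation for a hydrogen molecule:

$$-\frac{h^2}{2m_p}\nabla_a^2\Psi - \frac{h^2}{2m_p}\nabla_b^2\Psi - \frac{h^2}{2m}\nabla_1^2\Psi - \frac{h^2}{2m}\nabla_2^2\Psi - \left( \frac{e^2}{r_{a1}} + \frac{e^2}{r_{b2}} \right)\Psi$$

$$+ \left( \frac{e^2}{r_{ab}} + \frac{e^2}{r_{12}} - \frac{e^2}{r_{a2}} - \frac{e^2}{r_{b1}} \right)\Psi = E\Psi \tag{34.7}$$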


The first two terms describe the motion of the nuclei in the mole¬ 
cule. They involve the proton mass in the denominator and are 
therefore small in comparison with the terms describing the motions 
of the electrons. Physically, this means that the nuclei are moving 
much slower than the electron, so that the electron wave function 
may be determined for a fixed distance between the nuclei; E de¬ 
pends explicitly on the distance between the nuclei, which is in¬ 
volved as a parameter in the left-hand side of (34.7). 

We regard $E(r_{ab})$ as the potential energy of the nuclei for the
given distance between them. If this function has a minimum denot¬ 
ing the state of stable equilibrium of the nuclei for the given quan¬ 
tum electronic state, the atoms can combine into a molecule. In 
future we shall not write the terms corresponding to the kinetic 
energy of the nuclei—they have to be taken into account when vibra¬ 
tional, rotational, or translational motion of a molecule is con¬ 
sidered, but the stable equilibrium configuration, which is deter¬ 
mined by the electron state, should be defined without account of 
$-(h^2/2m_p)\nabla_a^2$ and $-(h^2/2m_p)\nabla_b^2$.

The perturbing Hamiltonian, which was denoted $\hat{V}_{12}$, is
transferred into the second line of Eq. (34.7). Thus, in the first
approximation we obtain the corrections to the energy of the unperturbed,
ground state of the atoms by substituting the wave functions (34.5) 
and (34.6) into Eqs. (34.3) and (34.4) and calculating the integrals 
of the known functions. We should not expect great precision in 
this method, because when the kinetic energy of the nuclei is dis¬ 
carded, Eq. (34.7) does not involve a small parameter; but in the 
first, semiquantitative, approximation it is nevertheless possible 
to encompass the main features of the posed problem. 

First of all, it should be expected that a lower energy corresponds
to the $\Psi_S$ state than to the $\Psi_A$ state. Indeed, $\Psi_A$ vanishes in the
plane perpendicular to line $r_{ab}$ at half the distance between the
nuclei, that is, in the median plane. Thus $\Psi_A$ has a node. But $\Psi_S$
has no nodes, whence it follows that a smaller energy corresponds
to it.

Integrals (34.3) and (34.4) differ in one term: 

±A-±«tW J ♦(rj»(r h ) 


X ^ (r 02 ) ^(r&,) dx x dx t (34.8)
which is similar to the exchange integral in the equation for
expressing energy by Fock’s self-consistent field method (see (33.20)).
But in the present case the integral involves known functions, so
that it can be calculated and the curve $E(r_{ab})$ can be plotted. The
quantity A in (34.8) is also called an exchange integral. The
corresponding curve indeed has a minimum for a certain value, $r_{0ab}$,
the depth of the potential well corresponding to the binding energy
of a hydrogen molecule to the extent that can be expected in the
given approximation. If necessary, agreement with experimental
data can be enhanced by applying certain computational
methods.

The curve with the minimum corresponds only to a symmetric
spatial wave function or to an antisymmetric spin function. We
obtain, as in the atom, the effective spin interaction energy, which
can be represented with the help of the operator $(\boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2)$.

A basic feature of valence forces is that they are capable of becom¬ 
ing saturated. A third hydrogen atom does not combine with 
a diatomic hydrogen molecule. This follows from the Heitler-London 
theory: the electron spin in the third atom is necessarily parallel 
to the spin of one of the two atoms in the molecule, so that the extra 
atom cannot combine with it. The scheme of mutually saturating 
spins can be applied to a very large class of compounds, especially 
organic compounds. 

However, another approach to the problem of valence is possible, 
and in some cases absolutely essential. Let us start again with the 
hydrogen molecule, not with separate atoms this time, but with 
atoms brought so close together that their nuclei merge. Obviously, 
in its electrostatic action on the electrons, such a nucleus is equiv¬ 
alent to a helium nucleus, and the electrons of the atoms form an 
extremely stable shell, as in helium. If the nuclei are not held at 
one point, the Coulomb repulsive forces push them some distance 
apart, and the electron shell stretches. At some distance between 
the nuclei the force appearing in the deformation of the shell ba¬ 
lances the repulsion of the nuclei, thereby forming a stable molecular 
configuration. Here it is also obvious that a third hydrogen atom 
cannot join it, since there is no place for its electron in a helium 
shell. 

Let us now consider the oxygen molecule. Every oxygen atom
has four electrons in the 2p shell and spin unity. As is known from
experimental data, an oxygen molecule also has unit spin,
consequently spin saturation does not occur. This can be explained by
examining the common shell of two oxygen atoms combined in
a molecule. Assuming that it has six vacancies as in the initial 2p
shells of the atoms, we see that in combining the extra two electrons
pass into a different state. According to Hund’s rule their spins are
parallel, which explains the value of the spin of the oxygen molecule.

The Electronic States of Diatomic Molecules. Let us consider the
electronic terms of a diatomic molecule. The field of a diatomic
molecule has axial symmetry, but not the spherical symmetry of the
field in an atom. Therefore the total orbital angular momentum of the
electrons is not an integral of the motion. Only its projection on
the line joining the nuclei is conserved. Like the projection of any
orbital angular momentum, it takes on integral values:
$\Lambda$ = 0, 1, 2, . . . .

The terms are accordingly denoted as $\Sigma$, $\Pi$, $\Delta$, .... Upper-case
Greek letters are used so as not to confuse molecular states with
atomic states, for which Latin letters are used. The notation takes
only the absolute value of the projection into account.

The Hamiltonian of the electrons in a diatomic molecule possesses
mirror symmetry with respect to a plane through the nuclei.
If we denote a coordinate on an axis perpendicular to this plane $\xi$,
a substitution of $-\xi$ for $\xi$ leaves the Hamiltonian unaffected. It
will readily be seen that the sign of the angular momentum
projection on the axis is changed at the same time. Indeed, let $\eta$ be a
coordinate lying in a plane perpendicular to the line joining the nuclei.
Then the angular momentum projection on the axis joining the
nuclei is $M_\zeta = \xi p_\eta - \eta p_\xi$. In a reflection in the plane $\xi$ and $p_\xi$
change their signs, while $\eta$ and $p_\eta$ do not, so that $M_\zeta$ changes its
sign too. For that the projection on the axis must not be zero.

Hence, states with $\Lambda \neq 0$, that is $\Pi$, $\Delta$, . . ., are 2-fold
degenerate: the angular momentum projection on the axis can have two
signs for the same energy.

Matters are different with the $\Sigma$ states. In a mirror reflection
in the plane, the function of such a state can only be multiplied by
a certain number, C, because there is only one function. In a second
reflection it is again multiplied by that number, that is, already by $C^2$.
But at the same time it must revert to its initial value. Consequently
$C^2 = 1$, $C = \pm 1$.

In other words, there are two different $\Sigma$ states. In one the wave
function does not change its sign in a reflection in a plane through the
nuclei, in the other it does. The first state is termed positive
and denoted $\Sigma^+$, the second is negative and denoted $\Sigma^-$.

These are two quite different electronic states of the molecule,
and their energies correspond to quite different potential curves
$E^+(r_{ab})$, $E^-(r_{ab})$. The curves are obtained for different initial
states of the atoms combining to form a molecule.

The electrons in a molecule may have a total spin S. If there is
an odd number of electrons, then S is equal to at least 1/2. If the
spin-orbit interaction is not great, the spin orients freely with
respect to the orbital angular momentum, and for a given $\Lambda$ produces
a (2S + 1)-fold degeneracy (we recall that spin and orbital angular
momenta are pseudovectors, so that in stating $\Lambda$ we obtain 2S + 1
projections of S). The number 2S + 1 is written, as for atomic terms,
as a superior prefix attached to $\Lambda$: $^{2S+1}\Lambda$.

If a molecule consists of two like atoms, the electron wave
function possesses an additional symmetry with respect to the median
plane between the nuclei perpendicular to the line joining them.
A reflection in this plane does not alter the sign of $M_\zeta$; hence
for all $\Lambda$’s there may be two states, even and odd, which in the term
notation are written not as for atoms, but as a subscript: $\Lambda_g$, $\Lambda_u$.

For very many molecules the ground state is $^1\Sigma^+$, and if they are
made up of the same atoms, $^1\Sigma_g^+$. The fact that S is equal to zero is
explained by the mutual saturation of spins in homopolar molecules.
In heteropolar molecules the shell of one of the atoms may be
completely filled, as we saw in the case of HF. Here the spin is also equal
to zero. Unlike $\Pi$, $\Delta$, . . ., the $\Sigma$ state has no degeneracy, and hence $\Sigma$
is preferable for the ground state. Finally, unlike odd and negative
states, even and positive states need not necessarily have nodal
surfaces on the planes of symmetry. Therefore, the ground states
are as a rule even and positive.

However, there are exceptions to the rules: the ground state
of an oxygen molecule is $^3\Sigma_g^-$. An acceptable explanation of the
reason why the spin of the oxygen molecule is unity was cited above.
We shall now show how the negative value of the term can be
explained. If the spins have equal projections, the spatial states of the
electrons must be different. In the initial atoms these were p states,
whose wave functions are $Y_1^{-1}$, $Y_1^{0}$, and $Y_1^{1}$. But in the present case
we must take $Y_1^{-1}$ and $Y_1^{1}$ so as to obtain a zero projection of the
angular momentum on the symmetry axis for an antisymmetric
spatial function.

Since the total spin is unity, the spin function of both electrons
is symmetric with respect to their interchange (Sec. 33). Therefore
the spatial function must be antisymmetric.

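If it is developed from the functions of the electrons in the atoms
it has the form

$$Y_1^{1}(\theta_1, \varphi_1)\,Y_1^{-1}(\theta_2, \varphi_2) - Y_1^{1}(\theta_2, \varphi_2)\,Y_1^{-1}(\theta_1, \varphi_1)$$

$$\propto \sin\theta_1 \sin\theta_2\,\left( e^{i(\varphi_1 - \varphi_2)} - e^{-i(\varphi_1 - \varphi_2)} \right) \tag{34.9}$$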

In a reflection in a plane through the nuclei the angle
$\varphi = \arctan(\xi/\eta)$ changes its sign together with $\xi$. Hence the
function (34.9) also changes its sign, due to the factor in the parentheses.
This explains why the ground function is of necessity $\Sigma^-$.
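
The sign change can be confirmed numerically. The sketch below is an illustration only: it uses unnormalized angular factors sin θ exp(±iφ) in place of the spherical functions (constant factors and phase conventions are dropped), and the function name is hypothetical.

```python
import cmath
import math


def antisymmetric_pair(theta1, phi1, theta2, phi2):
    """Unnormalized antisymmetric combination built from m = +1 and m = -1."""
    def angular(m, theta, phi):
        # sin(theta) * exp(i m phi); normalization and overall sign are omitted
        return math.sin(theta) * cmath.exp(1j * m * phi)
    return (angular(1, theta1, phi1) * angular(-1, theta2, phi2)
            - angular(1, theta2, phi2) * angular(-1, theta1, phi1))


point = (0.7, 0.4, 1.9, 2.3)          # arbitrary angles theta1, phi1, theta2, phi2
original = antisymmetric_pair(*point)
reflected = antisymmetric_pair(0.7, -0.4, 1.9, -2.3)   # phi -> -phi for both electrons
exchanged = antisymmetric_pair(1.9, 2.3, 0.7, 0.4)     # electrons interchanged

print(original, reflected, exchanged)
```

Both the reflection and the interchange of the electrons reverse the sign of the function, while a common rotation of both azimuthal angles leaves it unchanged, as expected for a state with zero projection on the axis.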

Of course, spherical functions are not exact solutions of the
electronic states in a molecule, but such a wave-function characteristic
as the number or location of nodal surfaces with respect to the
symmetry planes does not depend on the fine form of the force field.
Rotation of Nuclei. Let us now consider the motion of a molecule
as a whole, restricting ourselves to a singlet state$^{13}$ (S = 0).

The initial approximation in the theory of molecules consists 
in considering the motion of electrons when the position of the 
nuclei is assumed to be fixed. This makes it possible to determine 
the energy of the electron term as a function of the distance between 
the nuclei, $E(r_{ab})$. For the sake of brevity we shall write simply r,
and replace E by U, since this function takes the place of potential
energy in nuclear motion.

In the next approximation the states of motion of the nuclei for 
the given function U (r) are defined. In other words, their states 
averaged over the motion of the electrons are considered. 

Let us describe the motion in the centre-of-mass frame of the 
nuclei. As we know from Section 3, a two-body problem reduces 
to a problem on the motion of one body of mass equal to the reduced 
mass of the two bodies, m'. Since the potential energy of the nuclei 
depends only upon the distance between them, their Hamiltonian 
is conveniently expressed in polar coordinates: 


$$\hat{H}_n = \frac{\hat{p}_r^2}{2m'} + \frac{h^2 \hat{M}^2}{2m'r^2} + U(r) \tag{34.10}$$


In this formula, M 2 is not an integral of the motion, since the nuclei 
of a molecule do not really form a closed system: they interact with 
the electrons. Whatever molecule we have, its total angular momen¬ 
tum K is an integral of the motion, as in every closed system. It is 
compounded of the total electron angular momentum, L, and the 
total nuclear angular momentum, M. Therefore, in Eq. (34.10) we 
substitute instead of $M^2$ the difference between the total and
electron angular momenta of the molecule:

$$M^2 = (\mathbf{K} - \mathbf{L})^2 \tag{34.11}$$

after which we average this operator over the motion of the
electrons.

Consider the means of different terms in Eq. (34.11). We have

$$\langle M^2 \rangle = \langle K^2 \rangle + \langle L^2 \rangle - 2\langle (\mathbf{K} \cdot \mathbf{L}) \rangle \tag{34.12}$$

We note that $K^2$ is an exact integral of the motion equal to K (K + 1),
and there is no need to average it; $\langle L^2 \rangle$ does not depend on the
motion of the nuclei. There remains the third term, which is
dependent upon K and L. Only the projection of L on the molecular axis $\zeta$,
equal to $\Lambda$, is diagonal.

The operators of the projections on the other axes do not possess
diagonal elements (see (30.18)). Consequently only the mean value
of the projection on the $\zeta$ axis, that is, the integral of the motion $\Lambda$,
is other than zero. It remains to find the mean value
$\langle (\mathbf{K} \cdot \mathbf{L}) \rangle = (\mathbf{K} \cdot \langle \mathbf{L} \rangle)$.


13 Atomic and molecular multiplets are designated as singlets, doublets,
triplets, quadruplets (or quartets), and further as in music: quintets, sextets,
septets, etc.

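The orbital angular momentum operator is defined as

$$\mathbf{M} = \mathbf{r} \times \mathbf{p}$$

where r is the separation of the nuclei and is by definition directed
along the molecular axis, and p is the momentum of the relative
motion of the nuclei. Thus

$$(\mathbf{r} \cdot \mathbf{M}) = \mathbf{r} \cdot (\mathbf{r} \times \mathbf{p}) = (\mathbf{r} \times \mathbf{r}) \cdot \mathbf{p} = 0$$

identically, or

$$\mathbf{r} \cdot (\mathbf{K} - \mathbf{L}) = 0 \quad \text{and} \quad (\mathbf{r} \cdot \mathbf{K}) = (\mathbf{r} \cdot \mathbf{L}) \tag{34.13}$$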

so that the total angular momentum projection is equal to the
electrons’ angular momentum projection on the same axis, that is, it is
equal to $\Lambda$. The maximum projection of K on an arbitrary fixed
axis in space will thus take on values starting with $\Lambda$, that is $\Lambda$,
$\Lambda + 1$, $\Lambda + 2$, ... . The greatest projection of K on an arbitrary
axis cannot be smaller than $\Lambda$.

Thus, the mean value of the scalar product

$$\langle (\mathbf{K} \cdot \mathbf{L}) \rangle = (\mathbf{K} \cdot \langle \mathbf{L} \rangle) = (\mathbf{K} \cdot \mathbf{n})\,\Lambda = \Lambda^2 \tag{34.14}$$

where $\mathbf{n} = \mathbf{r}/r$. Like $\langle L^2 \rangle$ it does not depend upon the motion of
the nuclei.

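From this we find the Hamiltonian of the nuclei, averaged over
the motion of the electrons:

$$\langle \hat{\mathscr{H}} \rangle_{el} = \frac{\hat{p}_r^2}{2m'} + U(r) + \frac{\hbar^2 \left( \langle L^2 \rangle - 2\Lambda^2 \right)}{2m'r^2} + \frac{\hbar^2 K(K+1)}{2m'r^2} \tag{34.15}$$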


Here it is convenient to include the term depending upon $\langle L^2 \rangle$ and $\Lambda^2$
in the potential energy, by analogy with the way the centrifugal
energy is included in $U_M$ for motion in a central field.

Usually, the equilibrium distance between the nuclei, denoted $r_e$,
is substituted into the denominator of the last term. Then $m' r_e^2$
represents the molecule's moment of inertia about its centre of mass.
It is apparent that in a system of two mass points only two, and
equal, components of the inertia tensor with respect to the principal
axes, drawn through the centre of inertia, are other than zero. The
moment of inertia about the symmetry axis is zero.

Energy Levels of a Molecule. Close to the potential energy minimum,
a system is capable of small vibrations. The vibration frequency
in a diatomic molecule is conventionally denoted $\omega_{vib}$, and the
corresponding vibrational quantum number, $v$. The energy of a diatomic
molecule may be represented as a sum of three terms:

$$E = E_{el} + \hbar\omega_{vib} \left( v + \frac{1}{2} \right) + \frac{\hbar^2 K(K+1)}{2m' r_e^2} \tag{34.16}$$

Here, $E_{el}$ is the energy corresponding to the minimum of the potential
curve (including the centrifugal term), and the second term is
the vibrational energy (cf. (28.51)). (Further on it will be explained
that Eq. (34.16) refers only to singlet molecular states.)

Let us compare the order of magnitude of the three terms in (34.16). 
The first is independent of the mass of the nuclei and is determined 
only by the electron motion. The distance between the electron 
energy states is of the order of one or several electron volts. 

The frequency of small vibrations in the second term involves
the square root of the mass of the oscillating particles in the
denominator (cf. (7.12), where $a$ is proportional to the mass). Thus, the
electron energy of the molecule is several tens or hundreds of times
greater than its vibrational energy. For the hydrogen molecule,
which has the smallest reduced mass, the vibrational quantum
$\hbar\omega_{vib}$ is about 0.5 eV; for other diatomic molecules it is
$\hbar\omega_{vib} \approx 0.1$-$0.2$ eV.

The third term in (34.16), which is due to the rotation of the 
molecule as a whole, involves the mass of the nuclei in the denomi¬ 
nator. The distances between the energy levels are in this case cor¬ 
respondingly smaller and comparable with the distances between 
the fine structure levels due to the magnetic interaction of the elec¬ 
trons. Under these conditions magnetic interaction cannot be consid¬ 
ered small in comparison with the rotational energy of the molecule, 
and it requires a special consideration for nonsinglet states. We 
shall not take up this question, however. 
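As a rough numerical illustration of how small the rotational term is, the sketch below evaluates the coefficient $\hbar^2/(2m' r_e^2)$ of $K(K+1)$ for the hydrogen molecule. The physical constants and the equilibrium distance $r_e \approx 0.741$ Å are assumed values that do not appear in the text, and the function names are purely illustrative.

```python
# Rotational energy scale of a diatomic molecule, hbar^2 K(K+1) / (2 m' r_e^2).
# Assumed inputs (not from the text): SI constants and r_e = 0.741 angstrom for H2.
HBAR = 1.054571817e-34      # J*s
EV = 1.602176634e-19        # joules per electron volt
M_PROTON = 1.67262192e-27   # kg


def rotational_constant_ev(reduced_mass_kg, r_e_m):
    """Return hbar^2 / (2 m' r_e^2) in eV, the coefficient of K(K+1)."""
    return HBAR**2 / (2.0 * reduced_mass_kg * r_e_m**2) / EV


def rotational_energy_ev(K, reduced_mass_kg, r_e_m):
    """Rotational term of the molecular energy for quantum number K, in eV."""
    return rotational_constant_ev(reduced_mass_kg, r_e_m) * K * (K + 1)


if __name__ == "__main__":
    m_reduced = M_PROTON / 2.0          # reduced mass of two protons
    r_e = 0.741e-10                     # assumed equilibrium distance, metres
    b = rotational_constant_ev(m_reduced, r_e)
    print(f"hbar^2/(2 m' r_e^2) = {b:.2e} eV")
    print(f"ratio to a vibrational quantum of 0.5 eV: {b / 0.5:.3f}")
```

With these assumed inputs the rotational coefficient comes out at several thousandths of an electron volt, well below the vibrational quantum of about 0.5 eV quoted above.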

Let us now consider the parity of the states of the molecule as 
a whole with respect to an inversion of the coordinates of all its 
particles, that is, the electrons and the nuclei. Inversion corresponds 
to a transition from a right-handed coordinate system to a left- 
handed one. In the process, the wave function may either remain 
unchanged or change sign. 

The behaviour of the $\Sigma^{\pm}$ terms under inversion is connected with the
rotational quantum number $K$. Note, firstly, that the inversion is
performed simultaneously in the fixed and intrinsic frames of reference.
In the intrinsic frame, the $\Sigma^{+}$ terms remain unchanged, while the
$\Sigma^{-}$ terms change their sign. Furthermore, the rotation of a molecule
as a whole is determined by the wave function, the parity of which
depends upon the parity of $K$ (see Sec. 29). Therefore, the parity of the
$\Sigma^{+}$ term is $(-1)^K$, and that of the $\Sigma^{-}$ term is $(-1)^{K+1}$.

The Effect of Nuclear Spin. The rotational states of homonuclear 
molecules depend on the spins of the nuclei. Nuclear spins make for 
an additional symmetry of the Hamiltonian associated with inter- 
change of the nuclei. Depending upon the spin of each nucleus, 
their wave function should either reverse its sign in a complete 
interchange of all the variables (spatial and spin), or remain unchanged. 
The first corresponds to nuclei with half-integral spin, subject 
to Pauli’s exclusion, the second corresponds to nuclei with integral 
or zero spin. In this case it is immaterial that nuclei are not ele¬ 
mentary particles. Insofar as they are moving separately, the applic¬ 
ability of Pauli’s exclusion is determined only by their total spin. 

The parity of the rotational wave function of a molecule in the $\Sigma$
state with respect to a reflection in the median plane between the
nuclei depends only upon $K$. Indeed, such a substitution means
that in place of the polar angle $\vartheta$ we take $\pi - \vartheta$, so that $\cos\vartheta$ is
replaced by $-\cos\vartheta$. The function $Y_K^0$ is the $K$th Legendre polynomial,
whose parity is equal to $(-1)^K$. The projection of $\mathbf{K}$ on the
symmetry axis in the $\Sigma$ state is zero ($K_\zeta = \Lambda$), and to this state
corresponds the wave function $Y_K^0 = P_K(\cos\vartheta)$. Thus, for even $K$
the spatial wave function of the nuclei is even, and for odd $K$ it is
odd. The parity of the spin wave function of identical nuclei in the $\Sigma$
state of a molecule is opposite the parity of $K$ if the nuclear spin
is half-integral, and coincides with the parity of $K$ if the nuclear
spin is integral.

Let us first consider the case when the spin of each nucleus is equal 
to 1/2, as in hydrogen. Then Pauli’s exclusion principle holds. 
A symmetric spin function corresponds to spin 1 (orthohydrogen), 
an antisymmetrical spin function corresponds to spin 0 (parahydro- 
gen). But spin 1 has three projections, 1, 0, —1, so that the mole¬ 
cules of orthohydrogen are in a 3-fold degenerate state with respect 
to the spin, while the molecules of parahydrogen are not in a de¬ 
generate state. 

In equations of the type (28.25) for the number of states of a mole¬ 
cule, the molecules of orthohydrogen must be multiplied by the 
factor 3, and molecules of parahydrogen by the factor 1. Further¬ 
more, both are multiplied by the factor 2 K + 1, according to the 
number of projections of K. 

The situation changes if we take deuterium instead of hydrogen. 
The nuclei of deuterium have unity spin, and they are not subject 
to Pauli’s exclusion principle. Their wave function is symmetric 
with respect to an interchange of the spatial and the spin variables. 
The parity of the spatial function is determined by the parity of $K$,
while the parity of the spin function is determined as follows: the
states with total spin 0 and 2 are even, the state with unity total
spin is odd (see the Exercise). These states are also known as ortho-
and para-states, respectively. But unlike hydrogen, deuterium occurs
in the ortho-state for even $K$, and in para-states for odd $K$.

Homonuclear molecules whose nuclei do not possess spin may be only
in even rotational states, because for them the total parity of the
wave function with respect to the interchange of variables is determined
solely by the parity of $K$. We repeat that the above reasoning
refers only to the $\Sigma$ states.
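The counting rules of this subsection can be condensed into a short sketch. It assumes, as the text does, a $\Sigma$ state whose spatial parity under interchange of the nuclei is $(-1)^K$; the function names are illustrative. For nuclear spin $I$ there are $(2I+1)(I+1)$ symmetric and $(2I+1)I$ antisymmetric two-nucleus spin states, and the rotational level $K$ additionally carries the factor $2K + 1$.

```python
from fractions import Fraction


def spin_state_counts(I):
    """Return (symmetric, antisymmetric) numbers of two-nucleus spin states."""
    n = int(2 * Fraction(I) + 1)          # projections of one nuclear spin
    return n * (n + 1) // 2, n * (n - 1) // 2


def rotational_weight(K, I):
    """Statistical weight of rotational level K for identical nuclei of spin I.

    Assumes the spatial function has parity (-1)**K under interchange of the
    nuclei, which the text states for Sigma states.
    """
    I = Fraction(I)
    n_sym, n_anti = spin_state_counts(I)
    half_integral = (2 * I) % 2 == 1      # nuclei subject to Pauli's exclusion
    spatial_even = K % 2 == 0
    if half_integral:
        # total function antisymmetric: spin parity opposite to that of K
        spin_count = n_anti if spatial_even else n_sym
    else:
        # total function symmetric: spin parity coincides with that of K
        spin_count = n_sym if spatial_even else n_anti
    return (2 * K + 1) * spin_count


if __name__ == "__main__":
    for name, I in (("hydrogen", Fraction(1, 2)), ("deuterium", 1), ("spinless", 0)):
        print(name, [rotational_weight(K, I) for K in range(4)])
```

For hydrogen the spin factors alternate as 1, 3, 1, 3 (para, ortho, ...), for deuterium as 6, 3, 6, 3, and for spinless nuclei the odd levels receive zero weight.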

The magnetic energy of the interaction of nuclear spins with their 
rotation is very small: it involves c 2 and the mass of the heavy par¬ 
ticle in the denominator. But thanks to the symmetry requirements 
for the wave function, the spin state of the nucleus strongly affects 
the motion of the molecule. 

The formula for the rotational levels of diatomic molecules, giving
the dependence of the energy upon $K$, may be applied to nonspherical
nuclei as well. Such nuclei are due to the following circumstance.
We have seen that nucleons with large angular momenta,
whose wave function is very far from being spherically symmetrical,
also take part in the filling of nuclear shells. If such a separate
nucleon is moving in the field of a symmetric core, owing to the
asymmetry of its own state it deforms the core, which elongates
and acquires axial symmetry instead of spherical. The rotational
levels of the nuclei correspond to low excitation energies (on the
nuclear scale) and can be distinguished because they obey the
interval rule

$$E_K - E_{K-1} = \frac{\hbar^2}{I} K \tag{34.17}$$

where $I$ is the moment of inertia of the nucleus.
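The interval rule follows directly from the rotational term $E_K = \hbar^2 K(K+1)/(2I)$. The sketch below checks this in exact arithmetic, with energies measured in units of $\hbar^2/I$ (a choice of units made here for convenience; the function names are illustrative).

```python
from fractions import Fraction


def rotational_level(K):
    """E_K in units of hbar^2 / I, that is K(K+1)/2."""
    return Fraction(K * (K + 1), 2)


def interval(K):
    """Spacing E_K - E_{K-1} in the same units; the interval rule predicts K."""
    return rotational_level(K) - rotational_level(K - 1)


if __name__ == "__main__":
    for K in range(1, 6):
        print(K, interval(K))
```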


EXERCISE 

Construct the spin wave functions of the ortho- and para-states of a
deuterium molecule in the $\Sigma$ electronic state.

Answer.

(i) Ortho-states. Spin projection 0:

$$\Psi(1, s_1)\,\Psi(-1, s_2) + \Psi(1, s_2)\,\Psi(-1, s_1), \qquad \Psi(0, s_1)\,\Psi(0, s_2)$$

Spin projection ±1:

$$\Psi(1, s_1)\,\Psi(0, s_2) + \Psi(1, s_2)\,\Psi(0, s_1)$$

$$\Psi(-1, s_1)\,\Psi(0, s_2) + \Psi(-1, s_2)\,\Psi(0, s_1)$$

Spin projection ±2:

$$\Psi(1, s_1)\,\Psi(1, s_2), \qquad \Psi(-1, s_1)\,\Psi(-1, s_2)$$

(ii) Para-states. Spin projection 0:

$$\Psi(1, s_1)\,\Psi(-1, s_2) - \Psi(1, s_2)\,\Psi(-1, s_1)$$

Spin projection ±1:

$$\Psi(1, s_1)\,\Psi(0, s_2) - \Psi(1, s_2)\,\Psi(0, s_1)$$

$$\Psi(-1, s_1)\,\Psi(0, s_2) - \Psi(-1, s_2)\,\Psi(0, s_1)$$

35 


THE QUANTUM THEORY OF SCATTERING 

The concept of the scattering cross section of particles, which was
defined in Section 6 in terms of classical mechanics, is directly
extended to quantum mechanics. Indeed, the differential scattering
cross section of the particles inside a given solid angle is the ratio
of the number of particles scattered into this angle in unit time to
the flux density of the incident particles. Since flux and flux density
can be defined quantum mechanically, the effective cross section has
the same sense in quantum theory as it has in classical theory.

We shall first examine an approximate method of determining 
the scattering cross section of particles and then go over to more 
exact methods. 

The Born Approximation. Let us determine the conditions in which
a scattering field can be regarded as a weak perturbation acting on
a particle.

Let the particle's energy at a sufficiently large distance from the
scatterer be $E$, and let the potential energy be of the order of
magnitude of $U$. It can be considered, for example, to be the depth of
the potential well representing the scatterer. We shall first consider
the case of $E \gg U$. Then the change in the wave vector of the
particle in the field is of the order

$$\frac{[2m(E - U)]^{1/2}}{\hbar} - \frac{(2mE)^{1/2}}{\hbar} \sim \left( \frac{m}{2E} \right)^{1/2} \frac{U}{\hbar}$$

If the dimensions of the region in which the field acts (the "width
of the potential well") are of the order of $a$, then the total phase
change of the wave function in the scattering field is estimated as

$$\left( \frac{m}{2E} \right)^{1/2} \frac{Ua}{\hbar} = \frac{Ua}{\hbar v}$$






where v is the velocity of the particle at a large distance from the 
scatterer. The obtained ratio must be substantially smaller than 
unity to consider the disturbance produced by the field weak. 

In the opposite case (when $|U| \gg E$) the wave number "inside
the well" (more precisely, "above the well", because $E > 0$, that is,
the motion of the particle is infinite) is equal to $(2m|U|)^{1/2}/\hbar$.

The criterion of the smallness of the phase change is
$a(2m|U|)^{1/2}/\hbar \ll 1$ (this relation does not include the energy
of the particle at all). Comparing this with the result of Section 28,
which refers to a well of finite depth, we see that the obtained
condition almost coincides with the condition for the absence of bound
states in the well, but it involves a strong inequality rather than
a simple one. A well can be regarded as a weak perturbation of the
initial motion of the particle only when the product $a\sqrt{|U|}$ is
many times smaller than the quantity at which the appearance of
a bound level is possible.

If the scattering is done not by a potential well, but by a poten¬ 
tial hump, that is, the particles are subject to repulsive forces, 
the criterion of the smallness of the perturbation is the same, but 
it bears no relation to bound states. 

When the necessary criteria of the smallness of the perturbation 
are satisfied, we may apply the general methods of Section 32, name¬ 
ly Eq. (32.42), to the problem of particle scattering. We shall as¬ 
sume the scattering to be elastic, that is, the state of the scatterer 
does not change in the scattering. Then the energy of a scattered 
particle prior to the scattering is the same as after the scattering, 
only the direction of its linear momentum changes. 

Assuming the perturbation to be weak, we may take the wave
functions of the incident and the scattered particle in the form of
plane waves. This corresponds to the Born approximation. Let the
momentum of a particle prior to the impact be $\mathbf{p}$, and after the
impact, $\mathbf{p}'$, where, as just pointed out, $p = p'$. Both momenta are
determined in the centre-of-mass frame of reference of the colliding
particles, and the two-body problem is, as always, reduced to the
one-body problem.

Equation (32.42) involves the function $g(E')$, that is, the number
of states per unit energy interval. To make use of Eq. (28.25) for
$g(E')$, we must normalize the wave functions of the free particles
to the same volume $V$ that is involved in $g(E')$. It is, of course,
eliminated from the final result. We could ultimately achieve the
same result by normalizing the wave functions to the $\delta$ function,
that is, to $\delta(\mathbf{p} - \mathbf{p}')$, but we would have to alter Eq. (28.25).

Thus, the wave functions of the initial and final states are:

$$\psi(\mathbf{p}) = \frac{1}{\sqrt{V}}\, e^{i\mathbf{p}\cdot\mathbf{r}/\hbar}, \qquad \psi(\mathbf{p}') = \frac{1}{\sqrt{V}}\, e^{i\mathbf{p}'\cdot\mathbf{r}/\hbar} \tag{35.1}$$

The function $\psi(\mathbf{p})$ corresponds to a flux density $v/V$, which can
be seen directly from (23.19):

$$\mathbf{j} = \frac{\hbar}{2imV} \left( e^{-i\mathbf{p}\cdot\mathbf{r}/\hbar}\, \mathrm{grad}\, e^{i\mathbf{p}\cdot\mathbf{r}/\hbar} - e^{i\mathbf{p}\cdot\mathbf{r}/\hbar}\, \mathrm{grad}\, e^{-i\mathbf{p}\cdot\mathbf{r}/\hbar} \right) = \frac{\mathbf{p}}{mV} = \frac{\mathbf{v}}{V} \tag{35.2}$$


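From (35.1), the matrix element for the transition is

$$U_{\mathbf{p}'\mathbf{p}} = \frac{1}{V} \int e^{i(\mathbf{p} - \mathbf{p}')\cdot\mathbf{r}/\hbar}\, U(\mathbf{r})\, dV \tag{35.3}$$

To find the transition probability, this matrix element must be
substituted into (32.42) and multiplied by $2\pi/\hbar$ and the number of
final states $g(E' = E)$. Since we are concerned with the probability
of a particle occurring in a solid-angle element $d\Omega$, Eq. (28.25)
should be additionally multiplied by $d\Omega/(4\pi)$. Therefore

$$dW = \frac{2\pi}{\hbar}\, |U_{\mathbf{p}'\mathbf{p}}|^2\, g(E)\, \frac{d\Omega}{4\pi} = \frac{mpV}{4\pi^2\hbar^4}\, |U_{\mathbf{p}'\mathbf{p}}|^2\, d\Omega \tag{35.4}$$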


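The differential scattering cross section is equal to the scattering
probability in unit time inside the solid-angle element, divided by
the flux density of the incident particles, $v/V = (2E/m)^{1/2}/V$.
Therefore

$$d\sigma = \frac{m^2}{4\pi^2\hbar^4} \left| \int e^{i(\mathbf{p} - \mathbf{p}')\cdot\mathbf{r}/\hbar}\, U(\mathbf{r})\, dV \right|^2 d\Omega \tag{35.5}$$

The matrix element appearing here now refers to the normalization
of the wave functions to the unit volume, $V = 1$. Introducing
the wave vectors $\mathbf{k} = \mathbf{p}/\hbar$, $\mathbf{k}' = \mathbf{p}'/\hbar$, we write

$$U_{\mathbf{k}'\mathbf{k}} = \int e^{i(\mathbf{k} - \mathbf{k}')\cdot\mathbf{r}}\, U(\mathbf{r})\, dV \tag{35.6}$$

$$d\sigma = \frac{m^2}{4\pi^2\hbar^4}\, |U_{\mathbf{k}'\mathbf{k}}|^2\, d\Omega \tag{35.7}$$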


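The expression (35.5) is simplified when the scattering field is
a central field, that is, it depends only upon $r$. Then integration can
be carried out over the angles in $U_{\mathbf{k}'\mathbf{k}}$, leaving only integration
with respect to the radius. In defining the polar angle $\vartheta$ we choose
the direction of $\mathbf{k} - \mathbf{k}'$ as the polar axis. After that the matrix
element reduces to the form

$$U_{\mathbf{k}'\mathbf{k}} = \int e^{i(\mathbf{k} - \mathbf{k}')\cdot\mathbf{r}}\, U(r)\, dV = 2\pi \int_0^\infty r^2 U(r)\, dr \int_0^\pi e^{i|\mathbf{k} - \mathbf{k}'| r \cos\vartheta} \sin\vartheta\, d\vartheta \tag{35.8}$$

Since $\sin\vartheta\, d\vartheta = -d\cos\vartheta$, we can integrate with respect to $\vartheta$
immediately:

$$U_{\mathbf{k}'\mathbf{k}} = 2\pi \int_0^\infty r^2\, dr\, U(r)\, \frac{e^{i|\mathbf{k} - \mathbf{k}'| r} - e^{-i|\mathbf{k} - \mathbf{k}'| r}}{i|\mathbf{k} - \mathbf{k}'| r} = \frac{4\pi}{|\mathbf{k} - \mathbf{k}'|} \int_0^\infty r\, U(r) \sin(|\mathbf{k} - \mathbf{k}'| r)\, dr \tag{35.9}$$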


As we have already said, $k = k'$. Therefore, the vector difference
is easily expressed in terms of the deflection angle $\theta$ for the particle:

$$|\mathbf{k} - \mathbf{k}'|^2 = 2k^2 - 2(\mathbf{k} \cdot \mathbf{k}') = 2k^2(1 - \cos\theta) = 4k^2 \sin^2\frac{\theta}{2} \tag{35.10}$$

This can also be seen from a geometrical construction.

Thus

$$U_{\mathbf{k}'\mathbf{k}} = \frac{2\pi}{k \sin(\theta/2)} \int_0^\infty r\, U(r) \sin\left( 2kr \sin\frac{\theta}{2} \right) dr \tag{35.11}$$
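To see Eq. (35.11) at work, the radial integral can be evaluated numerically. The sketch below does this for an exponentially screened Coulomb potential $U(r) = -g\, e^{-r/a}/r$, chosen here only because the integral is then known in closed form, $U_{\mathbf{k}'\mathbf{k}} = -4\pi g/(q^2 + a^{-2})$ with $q = 2k\sin(\theta/2)$. This potential, the dimensionless units, and all function names are assumptions made for illustration and do not come from the text.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def born_matrix_element(U, k, theta, r_max, n=20000):
    """Eq. (35.11) with the radial integral cut off at r_max."""
    s = math.sin(theta / 2.0)
    q = 2.0 * k * s
    integral = simpson(lambda r: r * U(r) * math.sin(q * r), 0.0, r_max, n)
    return 2.0 * math.pi / (k * s) * integral


def screened_coulomb(g, a):
    """Illustrative potential U(r) = -g exp(-r/a) / r (not from the text)."""
    def U(r):
        # r*U(r)*sin(q r) vanishes at r = 0, so any finite value is safe there
        return -g * math.exp(-r / a) / r if r > 0.0 else 0.0
    return U


def analytic_screened(g, a, k, theta):
    """Closed form of (35.11) for the screened potential above."""
    q = 2.0 * k * math.sin(theta / 2.0)
    return -4.0 * math.pi * g / (q * q + 1.0 / (a * a))


if __name__ == "__main__":
    U = screened_coulomb(1.0, 1.0)
    for theta in (0.2, 1.0, 3.0):
        print(theta, born_matrix_element(U, 1.0, theta, 40.0),
              analytic_screened(1.0, 1.0, 1.0, theta))
```

The same routine can be applied to any central potential for which the integral in (35.11) converges; only the cut-off `r_max` and the number of subintervals need to be adapted.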


Take, for example, the Thomas-Fermi potential (33.31). Then

$$U(r) = -\frac{Ze^2}{r}\, \psi(x)$$

Substituting into (35.11) yields

$$U_{\mathbf{k}'\mathbf{k}} = -\frac{2\pi Ze^2}{k \sin(\theta/2)} \int_0^\infty \psi(x) \sin\left( 2kr \sin\frac{\theta}{2} \right) dr \tag{35.12}$$

This integral can be determined numerically for different values
of the dimensionless ratio

$$\frac{(3\pi)^{2/3}}{2^{7/3} Z^{1/3}}\, \frac{\hbar^2 k}{m e^2}$$

whence, after substitution into the general equation (35.7), we
obtain the differential scattering cross section of a fast charged
particle in an atom.

Let us consider the limiting case of small scattering angles, for
which $\sin(\theta/2)$ is replaced by $\theta/2$. Then (35.12) is reduced to the form

$$U_{\mathbf{k}'\mathbf{k}} = -\pi \left( \frac{3\pi}{4} \right)^{4/3} \frac{Z^{1/3} \hbar^4}{m^2 e^2} \int_0^\infty x\, \psi(x)\, dx \tag{35.13}$$

For $x \to \infty$, the function $\psi(x)$ decreases approximately as $x^{-3}$,
so that the integral involved in (35.13) has a finite value. This means
that at small angles the differential scattering cross section referred
to a solid-angle unit, that is $d\sigma/d\Omega$, tends to a finite limit. As can
be seen from (35.11), in the most general case for this the integral

$$\int_0^\infty r^2\, U(r)\, dr < \infty \tag{35.14}$$

must converge. Then $d\sigma/d\Omega$ tends to a finite limit at infinitesimal
scattering angles.

Thus, the dependence of the scattering cross section on the distance
to the scatterer in quantum mechanics is different from that in
classical mechanics. In Section 6 it was shown that if a force does not
vanish identically beyond some finite distance from the scatterer,
but merely tends to zero at infinity, then every particle, however far
from the scatterer it may pass, is slightly deflected. For that reason
the classical expression for the differential scattering cross section
for a small angle tends to infinity if the force of interaction of the
particle with the scatterer does not vanish at some finite distance $r_0$
from the scatterer.

This difference between the classical and the quantum theories
is explained on the basis of the uncertainty principle. The
uncertainty in the momentum of a particle passing at a distance $r$ from
a scatterer is $\Delta p \sim 2\pi\hbar/r$. If the interaction force decreases fast
enough with distance, the change in momentum for a large impact
parameter $r$ may prove smaller than $\Delta p$. This condition is defined
by the integral (35.14): when it converges, the uncertainty in the
momentum, $\Delta p$, associated with wave diffraction is greater than the
deflection as a result of the interaction of the particle with the
scatterer. But owing to diffraction the integrals take on finite values.

Suppose that the second condition for the applicability of the
Born approximation for the case of slow particles is satisfied. Then
$kr \ll 1$, since at small energies the wave number is also small. In
the expression (35.11) we replace $\sin[2kr \sin(\theta/2)]$ by $2kr \sin(\theta/2)$.
It turns out that if the condition (35.14) is satisfied, then the
scattering angle is eliminated altogether from the expression for the
scattering cross section. The scattering becomes isotropic. The same
number of particles passes through each solid-angle element.

Note that condition (35.14) is sufficient, though not necessary,
for the total cross section to be finite, that is $\sigma = \int d\sigma < \infty$. Even
when the differential scattering cross section for small angles
becomes infinite, but not according to a very strong law, the total cross
section may remain finite.

Let us now consider the case of particle scattering in a purely
Coulomb field. Then $U = \pm Ze^2/r$. The matrix element involved
in the general expression (35.5) was calculated for a Coulomb field
in Section 26 (see (26.37)), but we shall find it directly from (35.9),
defining the integral $\int_0^\infty \sin x\, dx$ thus:

$$\lim_{\alpha \to 0} \int_0^\infty e^{-\alpha x} \sin x\, dx = \lim_{\alpha \to 0} \frac{1}{\alpha^2 + 1} = 1$$

Then

$$\int_0^\infty \sin ax\, dx = \frac{1}{a}$$

and finally

$$U_{\mathbf{k}'\mathbf{k}} = \pm \frac{2\pi Ze^2}{k \sin(\theta/2)} \int_0^\infty \sin\left( 2kr \sin\frac{\theta}{2} \right) dr = \pm \frac{\pi Ze^2}{k^2 \sin^2(\theta/2)} \tag{35.15}$$

Substituting this into (35.7), we obtain the final expression for the
differential cross section:

$$d\sigma = \frac{Z^2 e^4\, d\Omega}{4 m^2 v^4 \sin^4(\theta/2)} \tag{35.16}$$

where we have taken advantage of the fact that $p = \hbar k = mv$.
This result curiously agrees with the precise classical Rutherford
formula (6.21).
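The step from (35.15) to (35.16) is a matter of substituting into (35.7) and using $\hbar k = mv$. A minimal numerical check of that algebra is sketched below; the numerical values are arbitrary test inputs in arbitrary consistent units, and the function names are illustrative.

```python
import math


def coulomb_matrix_element(Ze2, k, theta):
    """Magnitude of Eq. (35.15): pi Z e^2 / (k^2 sin^2(theta/2))."""
    return math.pi * Ze2 / (k * math.sin(theta / 2.0)) ** 2


def born_cross_section(U_kk, m, hbar):
    """Eq. (35.7) per unit solid angle: m^2 |U|^2 / (4 pi^2 hbar^4)."""
    return m**2 * abs(U_kk) ** 2 / (4.0 * math.pi**2 * hbar**4)


def rutherford(Ze2, m, v, theta):
    """Eq. (35.16) per unit solid angle."""
    return Ze2**2 / (4.0 * m**2 * v**4 * math.sin(theta / 2.0) ** 4)


if __name__ == "__main__":
    m, hbar, v, Ze2 = 2.0, 0.7, 1.3, 0.9      # arbitrary test values
    k = m * v / hbar                          # from p = hbar k = m v
    for theta in (0.1, 0.9, 2.5):
        born = born_cross_section(coulomb_matrix_element(Ze2, k, theta), m, hbar)
        print(theta, born, rutherford(Ze2, m, v, theta))
```

The sign of the potential drops out because only the squared modulus of the matrix element enters the cross section.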

It turns out that Eq. (35.16) is also obtained from a precise solu¬ 
tion of the wave equation for the case of a Coulomb field in the scat¬ 
tering problem. Thus, the Rutherford formula is extended to quan¬ 
tum mechanics unchanged. 

The Born approximation in the theory of scattering by a Coulomb
field can be regarded as the first nonvanishing term in a series
expansion of the exact formula in powers of $Ze^2$. But since the exact
formula does not involve powers higher than $(Ze^2)^2$, the result of the
Born approximation coincides with the precise result.

We shall now estimate the limits of applicability of the method
under consideration for the Coulomb field. To do this, we make use
of the first criterion, referring to large velocities. Since the product
$Ua$ in this case is equal to $Ze^2$, we arrive at the following condition:

$$\left( \frac{m}{2E} \right)^{1/2} \frac{Ua}{\hbar} = \frac{Ze^2}{\hbar v} \ll 1 \tag{35.17}$$

The quantity $e^2/(\hbar c) = 1/137$. Therefore, we write (35.17) thus:

$$\frac{Z}{137}\, \frac{c}{v} \ll 1 \tag{35.18}$$

But $Z \sim 90$ for heavy elements, so that (35.18) is not satisfied
in general. Of course, the Rutherford formula is applicable to
nonrelativistic particles in this case too, because it is exact.

Formula (35.16) is applicable also to the scattering of relativistic
particles at small angles, provided $m$ is replaced by $m(1 - v^2/c^2)^{-1/2}$.
The condition (35.18) here is not necessary, since small scattering
angles correspond to large impact parameters, for which the Coulomb
field represents a small disturbance.
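The criterion (35.18) is easily tabulated. The sketch below evaluates the parameter $Zc/(137v)$ for a few illustrative combinations of nuclear charge and velocity; the value 137.036 for $\hbar c/e^2$ and the chosen examples are assumptions made here, not data from the text.

```python
INVERSE_FINE_STRUCTURE = 137.036   # assumed value of hbar*c/e^2


def born_parameter_coulomb(Z, v_over_c):
    """Left-hand side of (35.18); the Born approximation needs it << 1."""
    return Z / (INVERSE_FINE_STRUCTURE * v_over_c)


if __name__ == "__main__":
    for Z, beta in ((1, 0.5), (10, 0.5), (90, 0.1), (90, 0.9)):
        print(f"Z = {Z:2d}, v/c = {beta}: {born_parameter_coulomb(Z, beta):.3f}")
```

For a heavy nucleus the parameter stays of the order of unity even at velocities close to that of light, in line with the remark that (35.18) is not satisfied in general.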

The General Theory of Particle Scattering in a Central Field. We
shall now consider the particle scattering problem in its exact form.
For this we must solve the Schrödinger equation for the motion of
a particle in a central field. But unlike Section 29, where we
determined the eigenfunctions of the energy operator, here we must look
for solutions corresponding to entirely different boundary conditions.
Namely, at infinity the wave function must comprise a function
corresponding to an incident plane wave, which describes the incident
particles, and a function corresponding to a spherical outgoing
wave, which describes the scattered particles. Both functions are
complex, and we must make use of expansion methods somewhat
differing from the general methods of expanding wave functions
in operator eigenfunctions (Sec. 25).

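Every wave function of a particle in a central field satisfies the
Schrödinger equation

$$-\frac{\hbar^2}{2m} \frac{\partial^2 \Phi}{\partial r^2} + \frac{\hbar^2 \hat{L}^2}{2mr^2}\, \Phi + U(r)\, \Phi = E\, \Phi \tag{35.19}$$

Here, $\Phi$ is related to the wave function by $\psi = \Phi/r$ (cf. (29.14)); the
energy $E$ is positive, since the motion is infinite.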

We represent the solution $\Phi$ in the form of an expansion in
eigenfunctions of the angular momentum square. We select for the polar
axis the direction of the linear momentum of the incident particles.
We also make use of the fact that the scattering is symmetric with
respect to the azimuth. If the particles possess no spin, or if their
spin state is a mixture of states with spins of opposite sense in equal
proportions (see Sec. 27), then there always exists azimuthal
scattering symmetry, and such symmetry will exist as long as spin-orbit
coupling is of no consequence.

The eigenfunctions of angular momentum with azimuthal symmetry
are $Y_l^0 = P_l(\cos\theta)$ (see Sec. 29). Therefore the general solution
to Eq. (35.19) of the required form is

$$\psi = \frac{\Phi}{r} = \frac{1}{r} \sum_l R_l(r)\, P_l(\cos\theta) \tag{35.20}$$

where $R_l(r)$ satisfies the equation

$$-\frac{\hbar^2}{2m} \frac{d^2 R_l}{dr^2} + \frac{\hbar^2 l(l+1)}{2mr^2}\, R_l + U(r)\, R_l = E R_l \tag{35.21}$$


At large distances from the scatterer the centrifugal and potential
energies tend to zero, so that the asymptotic form of (35.21) is

$$-\frac{\hbar^2}{2m} \frac{d^2 R_l}{dr^2} = E R_l \tag{35.22}$$

The general solution to this equation is

$$R_l = A_l \sin\left( kr + \delta_l - \frac{\pi l}{2} \right) \tag{35.23}$$

Here, $k = (2mE)^{1/2}/\hbar$. Since solution (35.23) is asymptotic, it need
not become zero for $r = 0$. The term $-\pi l/2$ is introduced to make
the solution exact for a particle in free motion, that is, for $U = 0$
(how this occurs will be shown later in this section). The term $\delta_l$,
called the phase shift, shows by how much the phase of a particle
moving in the field of the scatterer differs from the phase of a free
particle.

In order to compute $\delta_l$ in explicit form we must find the solution
to the exact equation (35.21), which at the origin of the coordinate
system vanishes as $r^{l+1}$ (see (29.20)), and continue it into the domain
where the asymptotic solution (35.23) becomes valid. We shall
consider $\delta_l$ to be known, but first let us determine the condition
for its existence.

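At large distances from the scatterer the potential and centrifugal
energies are close to zero, and the particle is almost free. Its motion
in this case is quasi-classical. But in such an approximation the
solution of (35.21) looks like

$$R_l \sim \sin\left\{ \frac{1}{\hbar} \int_{r_0}^{r} \left[ 2m \left( E - U(r) - \frac{\hbar^2 (l + 1/2)^2}{2mr^2} \right) \right]^{1/2} dr + \frac{\pi}{4} \right\}$$

($r_0$ is the turning point where the radicand is zero, cf. (31.40)); for
a free particle, that is, for $U(r) = 0$, the solution is

$$R_l^0 \sim \sin\left\{ \frac{1}{\hbar} \int_{r_0}^{r} \left[ 2m \left( E - \frac{\hbar^2 (l + 1/2)^2}{2mr^2} \right) \right]^{1/2} dr + \frac{\pi}{4} \right\}$$

It is apparent that in the quasi-classical approximation the
difference between the phases of the wave functions of a free particle
and a particle moving in the field of the scatterer is

$$\delta_l \approx \frac{1}{\hbar} \left\{ \int_{r_0}^{r} \left[ 2m \left( E - U(r) - \frac{\hbar^2 (l + 1/2)^2}{2mr^2} \right) \right]^{1/2} dr - \int_{r_0}^{r} \left[ 2m \left( E - \frac{\hbar^2 (l + 1/2)^2}{2mr^2} \right) \right]^{1/2} dr \right\} \tag{35.24}$$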

If it tends to a constant value when $r \to \infty$, then the phase exists
always, and not only in the quasi-classical approximation, because
the domain of large $r$'s is the determining factor. Expanding the
integrand in (35.24) in a series in $U(r)$, we see that $\delta_l$ is finite if
the integral

$$\int_{r_0}^{\infty} \frac{(2m)^{1/2}\, U(r)\, dr}{\left[ E - \hbar^2 (l + 1/2)^2 / (2mr^2) \right]^{1/2}} \tag{35.25}$$

converges, which in turn depends on the convergence of $\int dr\, U(r)$
for large $r$. If $U(r)$ is a power function, then for the integral
$\int_{r_0}^{r} U(r)\, dr$ to converge as $r \to \infty$ it is sufficient for the exponent
in $U(r) = a/r^n$ to be greater than unity. Thus, finite phases do not
exist for the Coulomb field, but as was shown, an exact solution of the
scattering problem for it is possible.

Assuming the existence of $\delta_l$, let us find the factors $A_l$ in the
asymptotic wave functions. We note that solution (35.23) involves
the incident and scattered waves: it is a general solution. It is not
hard to determine the asymptotic form of the incident wave: at
a sufficient distance from the scatterer it is a plane wave. Since we
suppose that the particles are travelling along the polar axis, their
wave function for large distances from the scatterer must be written
as $e^{ikz}$. Such a function is normalized to unity in unit volume
(cf. (35.1)), therefore the flux density of the incident particles is
represented simply as $v$.

The difference $\psi - e^{ikz}$ is thus an asymptotic expression of a
scattered wave, provided that in place of $\psi$ we have substituted its
value for large distances from the scatterer. But at an infinite
distance the scattered wave should not contain particles moving towards
the scatterer. In other words, at infinity the function $\psi - e^{ikz}$ should
involve only outgoing waves of the type $e^{ikr}/r$. (Note that the wave
contains, together with time dependence, the dependence
$r^{-1} e^{-iEt/\hbar + ikr}$, which shows that the motion is in the direction of
increasing $r$.)

The function (35.23) involves both ingoing and outgoing waves.
If we expand $e^{ikz}$ in Legendre polynomials, the expansion
coefficients dependent upon $r$ will also correspond to both waves. We
must so choose the constant coefficients $A_l$ that the difference
$\psi - e^{ikz}$ contain only the outgoing wave.

Let us expand $e^{ikz} = e^{ikr\cos\theta}$ in Legendre polynomials, assuming
$r$ very large, since we are concerned only with the asymptotic
form of the functions of $r$. Let

$$e^{ikr\cos\theta} = \frac{1}{r} \sum_{l'=0}^{\infty} R_{l'}^0(r)\, P_{l'}(\cos\theta) \tag{35.26}$$

We multiply both sides of (35.26) by $P_l(\cos\theta)$ and integrate over
$\sin\theta\, d\theta$. In the right-hand side we obtain, from (29.10),

$$\frac{1}{r} \sum_{l'} R_{l'}^0 \int_0^\pi P_l(\cos\theta)\, P_{l'}(\cos\theta) \sin\theta\, d\theta = \frac{2 R_l^0(r)}{(2l + 1)\, r} \tag{35.27}$$

In the left-hand side, we put $\xi = \cos\theta$ and integrate once by parts to get

$$\int_0^\pi e^{ikr\cos\theta} P_l(\cos\theta) \sin\theta\, d\theta = \int_{-1}^{1} e^{ikr\xi} P_l(\xi)\, d\xi = \left[ \frac{e^{ikr\xi}}{ikr}\, P_l(\xi) \right]_{-1}^{1} - \frac{1}{ikr} \int_{-1}^{1} e^{ikr\xi}\, \frac{dP_l}{d\xi}\, d\xi$$

The integrated expression involves $r$ in the denominator; the
next term, if we integrate it once again, yields $r^2$ in the denominator;
etc., up to the $l$th derivative of $P_l(\xi)$, which is a constant number.
But we must retain only the term proportional to $r^{-1}$, which is of
importance for the asymptotic solution. We substitute the limits
into it and make use of the fact that $P_l(1) = 1$, $P_l(-1) = (-1)^l$
(see Exercise 1, Section 29).

Comparing the integrals in the left- and right-hand sides of
Eq. (35.26) multiplied by $P_l(\cos\theta)$, we obtain

$$R_l^0(r) = \frac{2l + 1}{2ik} \left[ e^{ikr} - (-1)^l e^{-ikr} \right] = \frac{2l + 1}{2ik}\, e^{i\pi l/2} \left[ e^{i(kr - \pi l/2)} - e^{-i(kr - \pi l/2)} \right] \tag{35.28}$$

where we replaced $(-1)^l$ by $e^{i\pi l} = e^{i\pi l/2} \times e^{i\pi l/2}$. Now represent
(35.23) as

$$R_l(r) = \frac{A_l}{2i} \left[ e^{i(kr + \delta_l - \pi l/2)} - e^{-i(kr + \delta_l - \pi l/2)} \right] \tag{35.29}$$

The difference $R_l - R_l^0$ should not involve an ingoing wave,
hence

$$A_l = \frac{2l + 1}{k}\, e^{i\delta_l + i\pi l/2} \tag{35.30}$$

Substituting this value of $A_l$ into (35.29) and forming the difference
$R_l - R_l^0$ we find the outgoing wave with angular momentum $l$:

$$R_l - R_l^0 = \frac{2l + 1}{2ik}\, e^{ikr} \left( e^{2i\delta_l} - 1 \right) \tag{35.31}$$

If the term $-\pi l/2$ had not been introduced into (35.23), it would
have appeared in the exponent.
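The asymptotic coefficient used in the plane-wave expansion can be checked numerically: for large $kr$ the projection $\int_{-1}^{1} e^{ikr\xi} P_l(\xi)\, d\xi$ should approach the boundary term $[e^{ikr} - (-1)^l e^{-ikr}]/(ikr)$, with a remainder of order $(kr)^{-2}$. The sketch below, with illustrative names and a plain Simpson rule standing in for any more refined quadrature, compares the two.

```python
import cmath


def legendre(l, x):
    """P_l(x) from the recurrence (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def projection_integral(l, kr, n=20000):
    """Simpson estimate of the integral of exp(i kr xi) P_l(xi) over [-1, 1]."""
    h = 2.0 / n                      # n must be even
    total = 0j
    for i in range(n + 1):
        xi = -1.0 + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * cmath.exp(1j * kr * xi) * legendre(l, xi)
    return total * h / 3.0


def leading_term(l, kr):
    """Boundary term kept in the text: [e^{ikr} - (-1)^l e^{-ikr}] / (i kr)."""
    return (cmath.exp(1j * kr) - (-1) ** l * cmath.exp(-1j * kr)) / (1j * kr)


if __name__ == "__main__":
    for l in (0, 1, 2, 3):
        exact = projection_integral(l, 200.0)
        print(l, abs(exact - leading_term(l, 200.0)))
```

For $l = 0$ the boundary term is exact, while for higher $l$ the printed differences fall off as $(kr)^{-2}$, which is why only the term proportional to $r^{-1}$ needs to be retained.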

Finally, the scattered wave is represented in the form

$$\psi - e^{ikz} = \frac{e^{ikr}}{r}\, f(\theta) \tag{35.32}$$

where

$$f(\theta) = \frac{1}{2ik} \sum_{l=0}^{\infty} (2l + 1) \left( e^{2i\delta_l} - 1 \right) P_l(\cos\theta) \tag{35.33}$$

Let us find the expression for the differential scattering cross
section. The radial component of a flux of particles through an
infinitely remote sphere in a unit solid angle is (cf. (35.2))

$$\frac{\hbar}{2im} \left( \psi_s^* \frac{\partial \psi_s}{\partial r} - \psi_s \frac{\partial \psi_s^*}{\partial r} \right) r^2 = v\, |f(\theta)|^2$$

where $\psi_s$ denotes the scattered wave (35.32). Dividing this relation by
the flux of incident particles, which is equal to $v$, we reduce the
expression for the differential cross section to the form

$$d\sigma = |f(\theta)|^2\, d\Omega \tag{35.34}$$

Equations (35.33) and (35.34) are fundamental in the theory of
scattering.

If all the phase shifts are small in comparison with unity, the 
expression for the effective cross section transforms into the Born 
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approximation formula, though we shall not verify this. We shall 
only note the following. The convergence of the series (35.33) or 
of other series developed from it requires that lim 6* = 0. Conse¬ 
quently, for large l the evaluations made on the basis of the Born 
approximation are correct. 

Making use of the orthogonality of the Legendre polynomials, we 
now^find the total scattering cross section 

a = j da = J | / (0) | 2 d& 

= -W S dQ | 2 ( 2Z + 0 (e™ 1 ~ !) p i ( cos 0) f 

= 1ST 2 ( 2Z + !) (2i' +1) (e™ 1 ~ 1) (e" 2iV ~ 1) 

Z', z 

X j Pi (cos 0) Pi> (cos 0) d£l 

=TST 2 ( 2Z + 4) ( 2Z ' +1) (e 2i6 '-l) 4r]rr 5 «' 

l', l 

= ^-S( 2Z + 1 ) \e 2i&l -i\ z (35.35) 

l 

Thus, the total cross section is separated into a sum of partial 
cross sections , each of which corresponds to the scattering of particles 
having a specific angular momentum (“arm” or impact parameter 
in classical terms): 

l 

where 

a l = ^-(2l + l)(e 2i6l -i)(e~ 2i6l -i) 

= (21 + 1 ) (2 — e Zi6 i — e~ 2i6 i) 

= -i- (21 +1) 2 (1 — cos 26,) = ijf- (2Z +1) sin 2 6, (35.36) 

The maximum scattering cross section for a particle with a given
angular momentum $l$ is thus

$$(\sigma_l)_{\max} = \frac{4\pi}{k^2}(2l+1) \tag{35.37}$$
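The separation into partial cross sections can be checked numerically: the sum of the $\sigma_l$ should reproduce the angular integral of $|f(\theta)|^2$, and no $\sigma_l$ can exceed the bound (35.37). The sketch below assumes NumPy; the wave number and the three phase shifts are arbitrary illustrative values, and the Gauss-Legendre quadrature is a choice made here.

```python
import numpy as np
from numpy.polynomial import legendre

def partial_cross_sections(k, deltas):
    """sigma_l = (4 pi / k^2) (2l + 1) sin^2(delta_l), Eq. (35.36)."""
    l = np.arange(len(deltas))
    return 4 * np.pi / k**2 * (2 * l + 1) * np.sin(np.asarray(deltas)) ** 2

def total_cross_section_by_quadrature(k, deltas, n_nodes=64):
    """Integrate |f(theta)|^2 over the sphere with Gauss-Legendre nodes."""
    # Nodes in x = cos(theta); dOmega = 2 pi dx because f has axial symmetry.
    x, w = legendre.leggauss(n_nodes)
    l = np.arange(len(deltas))
    coeffs = (2 * l + 1) * (np.exp(2j * np.asarray(deltas)) - 1) / (2j * k)
    f = legendre.legval(x, coeffs)
    return 2 * np.pi * np.sum(w * np.abs(f) ** 2)

k = 1.5                     # illustrative wave number
deltas = [0.8, 0.3, 0.05]   # illustrative phase shifts for l = 0, 1, 2
sigma_l = partial_cross_sections(k, deltas)
sigma_max = 4 * np.pi / k**2 * (2 * np.arange(len(deltas)) + 1)
```

The cross terms with $l \neq l'$ drop out of the quadrature only because of the orthogonality of the Legendre polynomials, which is the step used in the derivation of (35.35).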

The relation between the angular momentum and the impact parameter
(“arm”) is conveniently written in the quasi-classical approximation,
and as was pointed out in Section 33 (cf. (33.38)), instead
of $[l(l+1)]^{1/2}$ we should take $l + 1/2$ for the absolute value of
the angular momentum. Then

$$l + \tfrac{1}{2} = k\rho \tag{35.38}$$

where $\rho$ is the impact parameter.

Thus, the partial cross section (35.36) corresponds to a ring on
the impact plane. But there is also a difference, due to the wave
character of motion in quantum mechanics. In the classical scattering
theory (Sec. 6), the differential scattering cross section was

$$d\sigma_{\text{classical}} = 2\pi\rho\, d\rho$$

From (35.38) $\rho = (l + 1/2)/k$, and $d\rho$, which corresponds to an
increase in $l$ by unity, is $1/k$. Therefore the classical expression
would have been

$$\sigma_{\text{classical}} = \frac{\pi}{k^2}(2l + 1)$$

that is, one-quarter the maximum quantum cross section.

A result similar to the quantum result is obtained in wave optics 
in the separation of a plane wave front into circular Fresnel zones. 
When all the zones but one are closed by a screen, the amplitude 
of the wave passing through the ring aperture is twice as great as 
through an equal area of unobstructed wave surface. But having 
the strict result (35.37), there is no need to continue to develop the 
analogy with approximate zone construction. 

The function $f(\theta)$ is called the scattering amplitude. Its component
corresponding to the angular momentum $l$ is the partial scattering
amplitude $f_l(\theta)$. The partial cross section $\sigma_l$ is expressed in terms
of it.

Of interest is an examination of the behaviour of the scattering
amplitude $f_l(\theta)$ when the scattering angle $\theta$ tends to zero. Remembering
that $P_l(1) = 1$, we find

$$f_l(0) = \frac{1}{2ik}(2l+1)\left(e^{2i\delta_l} - 1\right) \tag{35.39}$$

Let us find the imaginary part of this expression:

$$\operatorname{Im} f_l(0) = -\frac{1}{2k}(2l+1)(\cos 2\delta_l - 1) = \frac{2l+1}{k}\sin^2\delta_l \tag{35.40}$$

Comparing it with (35.36), we obtain the relationship between the
scattering amplitude for angle zero and the partial scattering cross
section:

$$\operatorname{Im} f_l(0) = \frac{k}{4\pi}\,\sigma_l \tag{35.41a}$$

Since the proportionality factor does not involve $l$, the relation
between the total zero-angle scattering amplitude and the total
cross section is the same:

$$\operatorname{Im} f(0) = \frac{k}{4\pi}\,\sigma \tag{35.41b}$$
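The relation between the forward amplitude and the total cross section holds for any set of real phase shifts, which makes it a convenient consistency check on a numerical partial-wave calculation. A minimal sketch, assuming NumPy, with arbitrary illustrative phase shifts:

```python
import numpy as np

def forward_amplitude(k, deltas):
    """f(0) from Eq. (35.33), using P_l(1) = 1."""
    l = np.arange(len(deltas))
    return np.sum((2 * l + 1) * (np.exp(2j * np.asarray(deltas)) - 1)) / (2j * k)

def total_cross_section(k, deltas):
    """Sum of the partial cross sections (35.36)."""
    l = np.arange(len(deltas))
    return 4 * np.pi / k**2 * np.sum((2 * l + 1) * np.sin(np.asarray(deltas)) ** 2)

k = 0.7                          # illustrative wave number
deltas = [1.1, -0.4, 0.2, 0.01]  # illustrative phase shifts
lhs = forward_amplitude(k, deltas).imag
rhs = k * total_cross_section(k, deltas) / (4 * np.pi)
```

Here `lhs` and `rhs` are the two sides of (35.41b) and agree to rounding error.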

Separation of the total cross section into partial cross sections is 
especially useful when one or several of the partial amplitudes are 
significant. 

For example, in the case of short-range forces, mainly particles 
in the s state, that is, with zero angular momentum, are scattered. 
The wave functions of all other states vanish in the region of action 
of the forces. But if only one phase is other than zero ($\delta_0 \neq 0$,
$\delta_{l \neq 0} = 0$), then only the s scattering amplitude, which in the
expansion is the factor of the zero Legendre polynomial, $P_0 = 1$,
is not zero. Hence, in the case of short-range forces the scattering
is spherically symmetrical. This was already discussed in Section 6.
But in the classical theory there always remains a sharp maximum
of the differential scattering cross section in the direction of the
incident beam. In quantum theory the s cross section is strictly
isotropic.


EXERCISES 

1. Find the scattering cross section of fast particles by hydrogen atoms 
in the ground state, assuming that the state does not change. 

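Solution. The wave function for the ground state of a hydrogen atom
with n = 1, l = 0 is

$$\psi_0 = B\, e^{-me^2 r/h^2}$$

The coefficient B is found from the normalization condition

$$4\pi B^2 \int_0^\infty e^{-2me^2 r/h^2}\, r^2\, dr = 1$$

whence

$$B^2 = \frac{1}{\pi}\left(\frac{me^2}{h^2}\right)^3$$

The potential energy of the charge e in the field of the atom is

$$U = -\frac{e^2}{r} + e^2 \int \frac{\psi_0^2(r')\, dV'}{|\mathbf{r} - \mathbf{r}'|}$$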
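The first term in $U_{\mathbf{k}\mathbf{k}'}$ was found in the text. It is equal to

$$-\frac{\pi e^2}{k^2 \sin^2(\theta/2)}$$

We find the second term in the following way:

$$
e^2 \int\!\!\int \frac{e^{i(\mathbf{k}-\mathbf{k}')\mathbf{r}}\, \psi_0^2(r')}{|\mathbf{r}-\mathbf{r}'|}\, dV\, dV'
= e^2 \int \psi_0^2(r')\, e^{i(\mathbf{k}-\mathbf{k}')\mathbf{r}'}\, dV' \int \frac{e^{i(\mathbf{k}-\mathbf{k}')(\mathbf{r}-\mathbf{r}')}}{|\mathbf{r}-\mathbf{r}'|}\, dV
$$

In the last integral it is necessary to take the origin at the point r'. Then
it reduces to the same form as the preceding one:

$$\int \frac{e^{i(\mathbf{k}-\mathbf{k}')\mathbf{r}}}{r}\, dV = \frac{\pi}{k^2 \sin^2(\theta/2)}$$

Hence

$$U_{\mathbf{k}\mathbf{k}'} = -\frac{\pi e^2}{k^2 \sin^2(\theta/2)} \left(1 - \int \psi_0^2(r')\, e^{i(\mathbf{k}-\mathbf{k}')\mathbf{r}'}\, dV'\right)$$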

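The quantity inside the brackets is called the screening factor. Evaluating
it in the same manner as (35.8), we obtain

$$\int \psi_0^2(r')\, e^{i(\mathbf{k}-\mathbf{k}')\mathbf{r}'}\, dV' = \frac{2\pi}{k \sin(\theta/2)} \int_0^\infty r'\, \psi_0^2(r') \sin[2kr' \sin(\theta/2)]\, dr'$$

The integral reduces to the form

$$\int_0^\infty x \sin ax\; e^{-bx}\, dx = -\frac{d}{db} \int_0^\infty \sin ax\; e^{-bx}\, dx = -\frac{d}{db}\, \frac{a}{a^2 + b^2} = \frac{2ab}{(a^2 + b^2)^2}$$

Here $a = 2k \sin(\theta/2)$, $b = 2me^2/h^2$, so that the screening factor is

$$1 - \left[1 + \left(\frac{hv}{e^2}\right)^2 \sin^2\frac{\theta}{2}\right]^{-2}$$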

It was assumed in the last formula that the scattered particle is an
electron, that is, it has the same mass and charge. Strictly speaking, we
should have formed a function which is antisymmetric to the function of
the atomic electron; this we did not do. The final formula for the elastic
scattering cross section differs from (35.16) by the square of the screening
factor. We note that this factor is correctly obtained only in the Born
approximation, in contrast to the Rutherford formula, which is exact.

For $\theta = 0$, the cross section turns out to be finite, because $\theta = 0$
corresponds to large impact parameters when the nuclear charge is screened
by the charge of an electron.
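The closed form of the screening factor can be compared with a direct numerical evaluation of the radial integral. The sketch below assumes NumPy and takes the ground-state density as $\psi_0^2 = B^2 e^{-br}$ with $B^2 = b^3/(8\pi)$, which is the normalized density for the exponent $b$ used in the solution; the values of `a` and `b`, the cut-off radius, and the hand-written trapezoidal rule are illustrative choices made here.

```python
import numpy as np

def ground_state_density(r, b):
    """psi_0(r)^2 = B^2 exp(-b r), with B^2 = b^3 / (8 pi) from normalization."""
    return b**3 / (8 * np.pi) * np.exp(-b * r)

def form_factor_numeric(a, b, r_max=60.0, n=200_001):
    """(4 pi / a) * integral_0^inf r psi_0^2 sin(a r) dr by the trapezoidal rule."""
    r = np.linspace(0.0, r_max, n)
    y = r * ground_state_density(r, b) * np.sin(a * r)
    integral = np.sum((y[1:] + y[:-1]) * np.diff(r)) / 2
    return 4 * np.pi / a * integral

def screening_factor(a, b):
    """1 - [1 + (a/b)^2]^(-2); here a/b equals (h v / e^2) sin(theta/2)."""
    return 1 - (1 + (a / b) ** 2) ** -2

a, b = 1.3, 2.0   # arbitrary illustrative values
```

For small `a / b` the screening factor behaves as `2 * (a / b) ** 2`, which cancels the $\sin^{-2}(\theta/2)$ singularity of the unscreened amplitude and keeps the cross section finite at zero angle.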

2. Calculate the effective scattering cross section for a particle scattered
on an opaque sphere of radius $a$ much smaller than $\lambda/(2\pi) = 1/k$,
so that only s scattering is significant.

Solution. From the boundary condition (28.1), on the surface of an
opaque sphere the wave function vanishes. Hence, solution (35.23) has the
form

$$R_0 = A_0 \sin k(r - a)$$

From this $\delta_0 = -ka$, and

$$\sigma = \frac{4\pi}{k^2} \sin^2 ka$$

But by definition $ka \ll 1$, so that $\sin ka \approx ka$, and $\sigma = 4\pi a^2$, that is, the
scattering radius is twice the radius of the sphere. The cross section in the
quantum theory is four times larger than in the classical theory (see Exercise 1,
Section 6).
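The approach of the s-wave cross section to four times the geometrical cross section can be seen numerically. A minimal sketch, assuming NumPy; the radius and the wave numbers are illustrative values.

```python
import numpy as np

def hard_sphere_s_wave_cross_section(k, a):
    """sigma = (4 pi / k^2) sin^2(ka), which follows from delta_0 = -ka."""
    return 4 * np.pi / k**2 * np.sin(k * a) ** 2

a = 1.0   # sphere radius in arbitrary units
for k in (0.3, 0.1, 0.01):
    ratio = hard_sphere_s_wave_cross_section(k, a) / (np.pi * a**2)
    print(f"ka = {k * a:5.2f}   sigma / (pi a^2) = {ratio:.4f}")
```

The printed ratio tends to 4 as `ka` decreases; for larger `ka` the higher partial waves, neglected here, would have to be included.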

3. Find the scattering cross section for particles scattered by a potential
well of depth close to $U_{cr}$, defined by Eq. (28.44). The first bound state
appears in the well at $U_0 = U_{cr}$. It is assumed that the energy of the incident
particle is much smaller than the difference $|U_{cr} - U_0|$, and the radius
of the well is much smaller than the wavelength of the incident particle
divided by $2\pi$. Express the cross section in terms of the energy of the bound
state.

Solution. Displace the potential energy curve (Section 28, Figure 31) 
along the vertical axis so as to obtain U = 0 for r > a, that is, at infinite 
distance from the origin we put U (r) zero. The condition for matching the 
wave function at r = a has the form 

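$$\frac{k \cos(ka + \delta_0)}{\sin(ka + \delta_0)} = \frac{\kappa \cos \kappa a}{\sin \kappa a}$$

where

$$k = \frac{1}{h}(2mE)^{1/2}, \qquad \kappa = \frac{1}{h}\left[2m(E + |U_0|)\right]^{1/2}$$

Neglecting $ka$, we obtain the expression for the scattering cross section:

$$\sigma = \frac{4\pi}{k^2} \sin^2\delta_0 = \frac{4\pi}{k^2(1 + \cot^2\delta_0)} = \frac{4\pi}{k^2 + \kappa^2 \cot^2 \kappa a}$$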


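Further, using the notation introduced in Section 28, we find the approximate
expression for $\kappa$:

$$\kappa = \frac{(2m|U_{cr}|)^{1/2}}{h} \left(1 + \frac{U_0 - U_{cr}}{2U_{cr}} + \frac{E}{2U_{cr}}\right)$$

By definition, $|U_0 - U_{cr}| \gg E$, so that

$$\kappa a \approx \frac{\pi}{2}\left(1 + \frac{U_0 - U_{cr}}{2U_{cr}}\right)$$

From this

$$\kappa \cot \kappa a \approx -\frac{\pi}{2a} \cdot \frac{\pi}{4}\, \frac{U_0 - U_{cr}}{U_{cr}} = -\frac{\pi^2}{8a}\, \frac{U_0 - U_{cr}}{U_{cr}}$$

$$\kappa^2 \cot^2 \kappa a = \frac{\pi^4}{64 a^2}\, \frac{(U_0 - U_{cr})^2}{U_{cr}^2} = \frac{2m|\varepsilon|}{h^2} \equiv \kappa_0^2$$

Finally, we obtain the scattering cross section in the form

$$\sigma = \frac{4\pi}{k^2 + \kappa_0^2}$$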
Note that nowhere in developing this equation did we make use of the
fact that $|U_0| > |U_{cr}|$. This formula also holds for $|U_0| < |U_{cr}|$,
when there is no level in the well, but one would appear for a slight
deepening of the well. In such cases we speak of “virtual levels”. Such a case
of a virtual level is observed in the scattering of neutrons on protons with
antiparallel spins, which is established by comparing the scattering cross
sections of neutrons on ortho- and parahydrogen.
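The near-threshold formula can be compared with the cross section obtained directly from the matching condition, without neglecting $ka$. The sketch below assumes NumPy, works in units with $h = m = a = 1$ (an assumption made here), treats the depths as positive numbers, and takes $U_{cr} = \pi^2 h^2/(8ma^2)$, the standard depth at which the first level appears in a spherical well; the energy and the 2% detuning are illustrative.

```python
import numpy as np

H = M = A = 1.0                          # units with h = m = a = 1 (assumption)
U_CR = np.pi**2 * H**2 / (8 * M * A**2)  # depth at which the first level appears

def sigma_from_matching(E, U0):
    """s-wave cross section from k cot(ka + delta_0) = kappa cot(kappa a)."""
    k = np.sqrt(2 * M * E) / H
    kappa = np.sqrt(2 * M * (E + U0)) / H
    # Equivalent form of the matching condition: tan(ka + delta_0) = k tan(kappa a) / kappa
    delta0 = np.arctan(k * np.tan(kappa * A) / kappa) - k * A
    return 4 * np.pi / k**2 * np.sin(delta0) ** 2

def sigma_near_threshold(E, U0):
    """Approximation 4 pi / (k^2 + kappa_0^2) with kappa_0 expressed through U0 - U_cr."""
    k2 = 2 * M * E / H**2
    kappa0_sq = np.pi**4 * (U0 - U_CR) ** 2 / (64 * A**2 * U_CR**2)
    return 4 * np.pi / (k2 + kappa0_sq)

E = 1e-4
U0 = 1.02 * U_CR   # slightly deeper than critical; 0.98 * U_CR gives the virtual level
```

For these values the two functions agree to within a few per cent, both for the slightly deeper well and for the slightly shallower one, which illustrates that the formula does not distinguish a real level from a virtual one.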

4. Express the cross section in terms of the scattering amplitude for
like particles mutually scattered on one another. Consider the case of particles
with zero and 1/2 spin.

Solution. Consider the motion in the particles' centre-of-mass reference
frame. In the case of zero spin, the wave function must be symmetrical with
respect to an exchange of the particles corresponding to a substitution of $\pi - \theta$
for $\theta$. Hence

$$d\sigma = |f(\theta) + f(\pi - \theta)|^2\, d\Omega$$

The normalization factor $1/\sqrt{2}$ is not required here, since there are two
particles and they are indistinguishable in the reaction.

In the case of 1/2 spin we have three ortho-states and one para-state.
Hence, the probability of a collision in the ortho-state is 3/4, in the para-state
it is 1/4, and the effective cross section is

$$
\begin{aligned}
d\sigma &= \left\{ \tfrac{3}{4}\, |f(\theta) - f(\pi - \theta)|^2 + \tfrac{1}{4}\, |f(\theta) + f(\pi - \theta)|^2 \right\} d\Omega \\
&= \left[ |f(\theta)|^2 + |f(\pi - \theta)|^2 - \tfrac{1}{2} f(\theta) f^*(\pi - \theta) - \tfrac{1}{2} f^*(\theta) f(\pi - \theta) \right] d\Omega
\end{aligned}
$$

In both cases there appears an “interference” term $ff^*$, which does not
appear in the analogous classical problem (cf. (6.22)). In the limiting transition
to classical mechanics the interference term does not tend to zero, because
it retains the finite value of the scattering amplitude $f$: $\sigma$ is expressed in
terms of it. Since in the limit the wavelength $\lambda$ tends to zero, $ff^*$ becomes
a very rapidly oscillating function, the mean of which over an infinitesimal
interval of angles tends to zero. The limiting transition from wave optics
to geometrical optics at the boundary of a shadow takes place in approximately
the same way.
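The two cross sections of Exercise 4 differ only in the weight of the interference term, which a short numerical check makes explicit. The sketch below assumes NumPy; `toy_amplitude` is a hypothetical amplitude invented purely for illustration and does not correspond to any physical potential.

```python
import numpy as np

def cross_section_spin0(f, theta):
    """Identical spin-0 particles: symmetric combination of amplitudes."""
    return np.abs(f(theta) + f(np.pi - theta)) ** 2

def cross_section_spin_half(f, theta):
    """Unpolarized spin-1/2 particles: weights 3/4 (ortho) and 1/4 (para)."""
    a, b = f(theta), f(np.pi - theta)
    return 0.75 * np.abs(a - b) ** 2 + 0.25 * np.abs(a + b) ** 2

def toy_amplitude(theta):
    # Hypothetical amplitude chosen only for illustration.
    return (1.0 + 0.5j) * np.cos(theta) + 0.3 * np.exp(1j * theta)

theta = np.linspace(0.1, np.pi - 0.1, 7)
a, b = toy_amplitude(theta), toy_amplitude(np.pi - theta)
# Expanded form: |a|^2 + |b|^2 - (a b* + a* b) / 2
expanded = np.abs(a) ** 2 + np.abs(b) ** 2 - np.real(a * np.conj(b))
```

Both cross sections are symmetric about $\theta = \pi/2$, as they must be for indistinguishable particles.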
THE QUANTUM THEORY OF RADIATION


An electromagnetic field in vacuum can be treated as a mechanical 
system (this was shown in Section 15). It is characterized by a Lagrangian,
action, etc. It is therefore legitimate to pose the problem
of quantizing this system, that is, applying quantum mechanics 
to it. 

The basic distinction between electrodynamics and the mechanics 
of mass points is that the degrees of freedom of an electromagnetic 
field are distributed continuously: to define a field at a given instant, 
its value at every spatial point must be defined. In this sense electrodynamics
resembles the mechanics of a fluid or an elastic body, if
they are treated as continuous media, disregarding the atomic
structure of matter. The coordinates of spatial points, as it were,
number the degrees of freedom of the field, while the values of the
potential amplitudes define the generalized coordinates. 

The electromagnetic field coordinates thus defined are not mutually 
independent. Indeed, the electromagnetic field equations involve
derivatives with respect to the coordinates, that is, differences between
values of the field at infinitely close points. In this sense the
field equations resemble the equations for coupled oscillations:
they are linear and involve several generalized coordinates taken
for infinitely close points in space. The equations of coupled oscillations
reduce to normal coordinates, which are mutually independent
(Sec. 7). The same can be done with the equations of electrodynamics,
thereby separating the dependent variables. This greatly
simplifies the application of quantum mechanics to radiation. 

Here we find vivid proof of the generality of the methods of analytical
mechanics: they allow the generalized coordinates and momenta to be
defined in such a way that the quantum laws can be applied uniquely.

Electromagnetic Field in a Closed Volume. First of all, we must 
visualize an electromagnetic field as a closed system, since quantum 
mechanics is most conveniently applied precisely to such systems. 
We may assume, for example, that an electromagnetic field is contained
in a box with mirror reflecting walls. The normal components
of the Poynting vector vanish on the walls of such an imaginary
box ($x = 0$ or $x = a_1$, $y = 0$ or $y = a_2$, $z = 0$ or $z = a_3$).

Let us fill all space with such boxes and assume that the field 
has the same value at corresponding points in each box. Such a field
is periodic in all three spatial directions:

$$\mathbf{A}(x, y, z) = \mathbf{A}(x + a_1, y, z) = \mathbf{A}(x, y + a_2, z) = \mathbf{A}(x, y, z + a_3) \tag{36.1}$$

But if we remove the reflecting walls, the field remains periodic 
anyway, because all its values are displaced at all points of space 
with the same fundamental velocity c. It is therefore sufficient to 
impose periodicity conditions (36.1) on the field and simply discard 
the walls. This appreciably simplifies the calculations, and the 
final results cannot depend on this or that supplementary device. 

A solution of the equations describing an electromagnetic field 
in vacuum was found in Section 18. Since the periodicity condition 
is imposed on the field, it can be expanded in a Fourier series in all 
three dimensions, that is, it can be represented in terms of separate 
harmonic components. These components must be real quantities. 
Expressing them explicitly only as functions of the coordinates, 
we write, by analogy with (18.25), 

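$$\mathbf{A}(\mathbf{k}, \mathbf{r}) = \mathbf{A}_{\mathbf{k}}\, e^{i\mathbf{k}\mathbf{r}} + \mathbf{A}_{\mathbf{k}}^{*}\, e^{-i\mathbf{k}\mathbf{r}} \tag{36.2}$$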

for one harmonic component. The reality of the vector potential A 
is apparent from this notation. 

Since we are considering the field in vacuum (in the absence of 
charges), the scalar potential can be put equal to zero, and the 
Lorentz condition (12.42) for the vector potential is reduced to the 
form div A = 0, that is (see (11.27)) 

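$$
\begin{aligned}
\operatorname{div} \mathbf{A}(\mathbf{k}, \mathbf{r}) &= \operatorname{div}\left(\mathbf{A}_{\mathbf{k}}\, e^{i\mathbf{k}\mathbf{r}}\right) + \operatorname{div}\left(\mathbf{A}_{\mathbf{k}}^{*}\, e^{-i\mathbf{k}\mathbf{r}}\right) \\
&= \left(\mathbf{A}_{\mathbf{k}} \cdot \operatorname{grad} e^{i\mathbf{k}\mathbf{r}}\right) + \left(\mathbf{A}_{\mathbf{k}}^{*} \cdot \operatorname{grad} e^{-i\mathbf{k}\mathbf{r}}\right) \\
&= i\,(\mathbf{k} \cdot \mathbf{A}_{\mathbf{k}})\, e^{i\mathbf{k}\mathbf{r}} - i\,(\mathbf{k} \cdot \mathbf{A}_{\mathbf{k}}^{*})\, e^{-i\mathbf{k}\mathbf{r}} = 0
\end{aligned}
$$

For this equation to hold for all $\mathbf{r}$, the coefficient of each exponent
must be equal to zero. In other words, vectors $\mathbf{A}_{\mathbf{k}}$ and $\mathbf{A}_{\mathbf{k}}^{*}$ are perpendicular
to the wave vector $\mathbf{k}$:

$$(\mathbf{k} \cdot \mathbf{A}_{\mathbf{k}}) = 0, \qquad (\mathbf{k} \cdot \mathbf{A}_{\mathbf{k}}^{*}) = 0 \tag{36.3}$$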

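For every $\mathbf{k}$ there exist two independent vectors
$\mathbf{A}_{\mathbf{k}}^{\sigma}$ ($\sigma = 1, 2$) corresponding to two possible polarizations of the
wave. It is natural to choose $\mathbf{A}_{\mathbf{k}}^{(1)}$ and $\mathbf{A}_{\mathbf{k}}^{(2)}$ mutually perpendicular.
Any vector in a plane perpendicular to $\mathbf{k}$ can be resolved along
$\mathbf{A}_{\mathbf{k}}^{(1)}$ and $\mathbf{A}_{\mathbf{k}}^{(2)}$.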

Let us now apply the periodicity condition (36.1) separately 
to each component of (36.2). We obtain 

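$$
\begin{aligned}
\mathbf{A}_{\mathbf{k}}\, e^{i(k_x x + k_y y + k_z z)} &= \mathbf{A}_{\mathbf{k}}\, e^{i[k_x (x + a_1) + k_y y + k_z z]} \\
&= \mathbf{A}_{\mathbf{k}}\, e^{i[k_x x + k_y (y + a_2) + k_z z]} = \mathbf{A}_{\mathbf{k}}\, e^{i[k_x x + k_y y + k_z (z + a_3)]}
\end{aligned}
$$

whence it follows that

$$e^{ik_x a_1} = e^{ik_y a_2} = e^{ik_z a_3} = 1$$

Therefore, the wave vector components are

$$k_x = \frac{2\pi n_1}{a_1}, \qquad k_y = \frac{2\pi n_2}{a_2}, \qquad k_z = \frac{2\pi n_3}{a_3} \tag{36.4}$$

with $n_1$, $n_2$, and $n_3$ being integers of any sign.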
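The discreteness introduced by (36.4) can be made concrete by counting the oscillators with $|\mathbf{k}|$ below some bound and comparing the count with the continuum estimate $2V \cdot \tfrac{4}{3}\pi k_{\max}^3/(2\pi)^3$. The sketch below assumes NumPy; the box edges and `k_max` are arbitrary illustrative values, and the trivial point $\mathbf{k} = 0$ is counted along with the rest for simplicity.

```python
import numpy as np

def count_modes(k_max, a1, a2, a3):
    """Count oscillators (k, sigma) with |k| <= k_max, where k_i = 2 pi n_i / a_i."""
    edges = (a1, a2, a3)
    n_max = [int(np.floor(k_max * a / (2 * np.pi))) for a in edges]
    axes = [2 * np.pi * np.arange(-n, n + 1) / a for n, a in zip(n_max, edges)]
    kx, ky, kz = np.meshgrid(*axes, indexing="ij")
    inside = kx**2 + ky**2 + kz**2 <= k_max**2
    return 2 * int(np.count_nonzero(inside))   # two polarizations per wave vector

def continuum_estimate(k_max, a1, a2, a3):
    """2 V (4 pi k_max^3 / 3) / (2 pi)^3, the large-volume limit of the count."""
    volume = a1 * a2 * a3
    return 2 * volume * (4 * np.pi * k_max**3 / 3) / (2 * np.pi) ** 3

box = (30.0, 40.0, 50.0)   # arbitrary box edges
k_max = 3.0
```

The count depends on the box only through its volume once the box is large compared with $1/k_{\max}$, which is why the periods drop out of final results.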

It follows that every harmonic oscillation is given by three integers
$n_1$, $n_2$, and $n_3$, and its polarization $\sigma$, which takes on two values.
The generalized coordinate is, as stated before, $\mathbf{A}^{\sigma}_{n_1, n_2, n_3}$. The number
of such coordinates is infinite, but it at least constitutes a countable
set, not a continuous assembly similar to the assembly of all spatial
points.

This is the basic simplification introduced by the periodicity
condition. Of course, it pursues only mathematical convenience:
the basic periods $a_1$, $a_2$, $a_3$ are not involved in any final results.

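An electromagnetic field is defined if the amplitudes of its oscillations
for all values of $n_1$, $n_2$, $n_3$, and $\sigma$ are known. Since the equations
of electrodynamics are linear, their general solution is equal
to the sum of the partial solutions (36.2):

$$\mathbf{A}(\mathbf{r}) = \sum_{\mathbf{k},\,\sigma} \mathbf{A}_{\mathbf{k}}^{\sigma}(\mathbf{r}) = \sum_{\mathbf{k},\,\sigma} \left(\mathbf{A}_{\mathbf{k}}^{\sigma}\, e^{i\mathbf{k}\mathbf{r}} + \mathbf{A}_{\mathbf{k}}^{\sigma *}\, e^{-i\mathbf{k}\mathbf{r}}\right) \tag{36.5}$$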

This is the required Fourier expansion. 


Energy of Field. We shall show that the expansion (36.5) provides 
all that is necessary to develop the normal coordinates. For this 
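we must express the energy of the field in terms of $\mathbf{A}_{\mathbf{k}}^{\sigma}$. An electric
field is calculated according to the general equation (12.35), which
for $\varphi = 0$ yields

$$\mathbf{E} = -\frac{1}{c}\frac{\partial \mathbf{A}}{\partial t} = -\frac{1}{c} \sum_{\mathbf{k},\,\sigma} \left(\dot{\mathbf{A}}_{\mathbf{k}}^{\sigma}\, e^{i\mathbf{k}\mathbf{r}} + \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma *}\, e^{-i\mathbf{k}\mathbf{r}}\right) \tag{36.6}$$

and for a magnetic field

$$
\begin{aligned}
\mathbf{H} = \operatorname{curl} \mathbf{A} &= \sum_{\mathbf{k},\,\sigma} \left[\left(\operatorname{grad} e^{i\mathbf{k}\mathbf{r}} \times \mathbf{A}_{\mathbf{k}}^{\sigma}\right) + \left(\operatorname{grad} e^{-i\mathbf{k}\mathbf{r}} \times \mathbf{A}_{\mathbf{k}}^{\sigma *}\right)\right] \\
&= i \sum_{\mathbf{k},\,\sigma} \left[\left(\mathbf{k} \times \mathbf{A}_{\mathbf{k}}^{\sigma}\right) e^{i\mathbf{k}\mathbf{r}} - \left(\mathbf{k} \times \mathbf{A}_{\mathbf{k}}^{\sigma *}\right) e^{-i\mathbf{k}\mathbf{r}}\right]
\end{aligned}
\tag{36.7}
$$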

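Let us now calculate the field’s energy, which from (45.24c) is

$$\mathcal{E} = \frac{1}{8\pi} \int \left(|\mathbf{E}|^2 + |\mathbf{H}|^2\right) dV \tag{36.8}$$

For $|\mathbf{E}|^2$ we have the following sum over $\mathbf{k}$, $\mathbf{k}'$, $\sigma$, $\sigma'$:

$$
\begin{aligned}
|\mathbf{E}|^2 = \frac{1}{c^2} \sum_{\mathbf{k},\,\mathbf{k}',\,\sigma,\,\sigma'} \Bigl( & \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma} \dot{\mathbf{A}}_{\mathbf{k}'}^{\sigma'}\, e^{i(\mathbf{k} + \mathbf{k}')\mathbf{r}} + \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma} \dot{\mathbf{A}}_{\mathbf{k}'}^{\sigma' *}\, e^{i(\mathbf{k} - \mathbf{k}')\mathbf{r}} \\
& + \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma *} \dot{\mathbf{A}}_{\mathbf{k}'}^{\sigma'}\, e^{-i(\mathbf{k} - \mathbf{k}')\mathbf{r}} + \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma *} \dot{\mathbf{A}}_{\mathbf{k}'}^{\sigma' *}\, e^{-i(\mathbf{k} + \mathbf{k}')\mathbf{r}} \Bigr)
\end{aligned}
\tag{36.9}
$$

Integrating $|\mathbf{E}|^2$ over the volume, it is useful to put the integral
under the summation sign. Then each integral separates into a product
of three integrals of the following form:

$$\int_0^{a_1} e^{i(k_x + k_x')x}\, dx = \int_0^{a_1} e^{2\pi i x (n_1 + n_1')/a_1}\, dx = \frac{a_1 \left[e^{2\pi i (n_1 + n_1')} - 1\right]}{2\pi i\,(n_1 + n_1')} = 0 \tag{36.10}$$

If $n_1 + n_1' = 0$, this integral is equal to $a_1$. Therefore, the triple
integral reduces to the following expression:

$$\int e^{i(\mathbf{k} + \mathbf{k}')\mathbf{r}}\, dV = a_1 a_2 a_3\, \delta_{n_1, -n_1'}\, \delta_{n_2, -n_2'}\, \delta_{n_3, -n_3'} = V \delta_{\mathbf{k}, -\mathbf{k}'} \tag{36.11}$$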
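The one-dimensional integral (36.10) and its exceptional case can be checked numerically. A minimal sketch, assuming NumPy; the uniform Riemann sum is a choice made here, and it happens to be exact (up to rounding) for these periodic integrands over a full period.

```python
import numpy as np

def mode_overlap(n, n_prime, a, samples=4096):
    """Approximate integral_0^a exp(2 pi i x (n + n') / a) dx by a uniform Riemann sum."""
    x = np.arange(samples) * a / samples
    return np.sum(np.exp(2j * np.pi * x * (n + n_prime) / a)) * a / samples

a = 2.5   # arbitrary period
```

For example, `mode_overlap(3, -3, a)` returns the period `a`, while any pair with `n + n_prime != 0` gives zero to rounding error, which is the content of the Kronecker symbols in (36.11).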

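Hence, the double sum over $\mathbf{k}$ and $\mathbf{k}'$ in $\int |\mathbf{E}|^2\, dV$ reduces to a single
sum; in the terms involving the product $\dot{\mathbf{A}}_{\mathbf{k}}^{\sigma} \dot{\mathbf{A}}_{\mathbf{k}'}^{\sigma'}$ we should replace
$\mathbf{k}'$ by $-\mathbf{k}$, and where we have $\dot{\mathbf{A}}_{\mathbf{k}}^{\sigma} \dot{\mathbf{A}}_{\mathbf{k}'}^{\sigma' *}$, $\mathbf{k}'$ must be replaced simply
by $\mathbf{k}$, because the factor of $\dot{\mathbf{A}}_{\mathbf{k}'}^{\sigma' *}$ is $e^{-i\mathbf{k}'\mathbf{r}}$. Thus

$$\int |\mathbf{E}|^2\, dV = \frac{V}{c^2} \sum_{\mathbf{k},\,\sigma,\,\sigma'} \left(\dot{\mathbf{A}}_{\mathbf{k}}^{\sigma} \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma' *} + \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma *} \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma'} + \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma} \dot{\mathbf{A}}_{-\mathbf{k}}^{\sigma'} + \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma *} \dot{\mathbf{A}}_{-\mathbf{k}}^{\sigma' *}\right) \tag{36.12}$$

But the vectors $\mathbf{A}_{\mathbf{k}}^{\sigma}$ and $\mathbf{A}_{-\mathbf{k}}^{\sigma'}$, $\mathbf{A}_{\mathbf{k}}^{\sigma}$ and $\mathbf{A}_{\mathbf{k}}^{\sigma' *}$ are perpendicular, if
$\sigma \neq \sigma'$. Hence, instead of the double sum over $\sigma$ and $\sigma'$ there also
remains the single sum over $\sigma$:

$$\int |\mathbf{E}|^2\, dV = \frac{V}{c^2} \sum_{\mathbf{k},\,\sigma} \left(2 \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma} \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma *} + \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma} \dot{\mathbf{A}}_{-\mathbf{k}}^{\sigma} + \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma *} \dot{\mathbf{A}}_{-\mathbf{k}}^{\sigma *}\right) \tag{36.13}$$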

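Equation (36.11) should also be used to calculate the integral
of the square of the magnetic field. But if $\mathbf{k}' = -\mathbf{k}$, the product
$\mathbf{k}' \times \mathbf{A}_{\mathbf{k}'}^{\sigma'}$ is replaced by $-\mathbf{k} \times \mathbf{A}_{-\mathbf{k}}^{\sigma'}$. Therefore

$$
\begin{aligned}
\int |\mathbf{H}|^2\, dV = V \sum_{\mathbf{k},\,\sigma,\,\sigma'} \Bigl[ & \left(\mathbf{k} \times \mathbf{A}_{\mathbf{k}}^{\sigma}\right)\left(\mathbf{k} \times \mathbf{A}_{-\mathbf{k}}^{\sigma'}\right) + \left(\mathbf{k} \times \mathbf{A}_{\mathbf{k}}^{\sigma *}\right)\left(\mathbf{k} \times \mathbf{A}_{-\mathbf{k}}^{\sigma' *}\right) \\
& + 2 \left(\mathbf{k} \times \mathbf{A}_{\mathbf{k}}^{\sigma}\right)\left(\mathbf{k} \times \mathbf{A}_{\mathbf{k}}^{\sigma' *}\right) \Bigr]
\end{aligned}
\tag{36.14}
$$

The vector products are expressed according to the known formulas:

$$
\begin{aligned}
\left(\mathbf{k} \times \mathbf{A}_{\mathbf{k}}^{\sigma}\right)\left(\mathbf{k} \times \mathbf{A}_{\mathbf{k}}^{\sigma' *}\right) &= k^2\, \mathbf{A}_{\mathbf{k}}^{\sigma} \mathbf{A}_{\mathbf{k}}^{\sigma' *} - \left(\mathbf{k} \cdot \mathbf{A}_{\mathbf{k}}^{\sigma}\right)\left(\mathbf{k} \cdot \mathbf{A}_{\mathbf{k}}^{\sigma' *}\right) \\
&= k^2\, \mathbf{A}_{\mathbf{k}}^{\sigma} \mathbf{A}_{\mathbf{k}}^{\sigma' *}
\end{aligned}
\tag{36.15}
$$

where the transverse condition (36.3) has been used. For $\sigma \neq \sigma'$
the expression (36.15) becomes zero. Consequently

$$\int |\mathbf{H}|^2\, dV = V \sum_{\mathbf{k},\,\sigma} k^2 \left(2 \mathbf{A}_{\mathbf{k}}^{\sigma} \mathbf{A}_{\mathbf{k}}^{\sigma *} + \mathbf{A}_{\mathbf{k}}^{\sigma *} \mathbf{A}_{-\mathbf{k}}^{\sigma *} + \mathbf{A}_{\mathbf{k}}^{\sigma} \mathbf{A}_{-\mathbf{k}}^{\sigma}\right) \tag{36.16}$$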

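Let us show that if $\mathbf{A}_{\mathbf{k}}^{\sigma}$ and $\mathbf{A}_{\mathbf{k}}^{\sigma *}$ are considered to be quantum
operators and expressed in terms of the operators of position and
momentum of a linear harmonic oscillator according to the formulas

$$\mathbf{A}_{\mathbf{k}}^{\sigma} = \left(\frac{\pi c^2}{V}\right)^{1/2} \mathbf{e}_{\mathbf{k}}^{\sigma} \left(Q_{\mathbf{k}}^{\sigma} - \frac{iP_{\mathbf{k}}^{\sigma}}{\omega_{\mathbf{k}}}\right) \tag{36.17a}$$

$$\mathbf{A}_{\mathbf{k}}^{\sigma *} = \left(\frac{\pi c^2}{V}\right)^{1/2} \mathbf{e}_{\mathbf{k}}^{\sigma} \left(Q_{\mathbf{k}}^{\sigma} + \frac{iP_{\mathbf{k}}^{\sigma}}{\omega_{\mathbf{k}}}\right) \tag{36.17b}$$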

then the electromagnetic field energy (36.8) is reduced to the sum
of the energy operators of linear independent harmonic oscillators.
The quantity $\mathbf{e}_{\mathbf{k}}^{\sigma}$ in equations (36.17a) and (36.17b) is the unit polarization
vector of the electromagnetic wave; $\omega_{\mathbf{k}} = ck$.

If $Q_{\mathbf{k}}^{\sigma}$ and $P_{\mathbf{k}}^{\sigma}$ are the operators of position and momentum of a linear
harmonic oscillator, they satisfy the quantum equations of
motion (27.18) and (27.19), in which we put m = 1:

$$\dot{Q}_{\mathbf{k}}^{\sigma} = P_{\mathbf{k}}^{\sigma} \tag{36.18}$$

$$\dot{P}_{\mathbf{k}}^{\sigma} = -\omega_{\mathbf{k}}^2\, Q_{\mathbf{k}}^{\sigma} \tag{36.19a}$$

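Then from equations (36.17a) and (36.17b) it follows that

$$\dot{\mathbf{A}}_{\mathbf{k}}^{\sigma} = i\omega_{\mathbf{k}} \left(\frac{\pi c^2}{V}\right)^{1/2} \left(Q_{\mathbf{k}}^{\sigma} - \frac{iP_{\mathbf{k}}^{\sigma}}{\omega_{\mathbf{k}}}\right) \mathbf{e}_{\mathbf{k}}^{\sigma} = i\omega_{\mathbf{k}}\, \mathbf{A}_{\mathbf{k}}^{\sigma} \tag{36.19b}$$

$$\dot{\mathbf{A}}_{\mathbf{k}}^{\sigma *} = -i\omega_{\mathbf{k}} \left(\frac{\pi c^2}{V}\right)^{1/2} \left(Q_{\mathbf{k}}^{\sigma} + \frac{iP_{\mathbf{k}}^{\sigma}}{\omega_{\mathbf{k}}}\right) \mathbf{e}_{\mathbf{k}}^{\sigma} = -i\omega_{\mathbf{k}}\, \mathbf{A}_{\mathbf{k}}^{\sigma *} \tag{36.19c}$$

If we substitute these expressions into (36.13), the last two terms
in the parentheses yield

$$\dot{\mathbf{A}}_{\mathbf{k}}^{\sigma} \dot{\mathbf{A}}_{-\mathbf{k}}^{\sigma} + \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma *} \dot{\mathbf{A}}_{-\mathbf{k}}^{\sigma *} = -\omega_{\mathbf{k}}^2 \left(\mathbf{A}_{\mathbf{k}}^{\sigma} \mathbf{A}_{-\mathbf{k}}^{\sigma} + \mathbf{A}_{\mathbf{k}}^{\sigma *} \mathbf{A}_{-\mathbf{k}}^{\sigma *}\right)$$

When substituted into the total energy formula, these two terms
from the integral $\int |\mathbf{E}|^2\, dV$ cancel out with the last two terms
from $\int |\mathbf{H}|^2\, dV$, that is, from (36.16). The operators $\mathbf{A}_{\mathbf{k}}^{\sigma}$ and $\mathbf{A}_{\mathbf{k}}^{\sigma *}$
do not commute, so to pass from the classical to the quantum expression
for energy we must replace $\mathbf{A}_{\mathbf{k}}^{\sigma} \mathbf{A}_{\mathbf{k}}^{\sigma *}$ by $(\mathbf{A}_{\mathbf{k}}^{\sigma} \mathbf{A}_{\mathbf{k}}^{\sigma *} + \mathbf{A}_{\mathbf{k}}^{\sigma *} \mathbf{A}_{\mathbf{k}}^{\sigma})/2$.
Substituting (36.17a) and (36.17b) into it, we obtain

$$\frac{V}{8\pi} \left[\frac{\dot{\mathbf{A}}_{\mathbf{k}}^{\sigma} \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma *} + \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma *} \dot{\mathbf{A}}_{\mathbf{k}}^{\sigma}}{c^2} + k^2 \left(\mathbf{A}_{\mathbf{k}}^{\sigma} \mathbf{A}_{\mathbf{k}}^{\sigma *} + \mathbf{A}_{\mathbf{k}}^{\sigma *} \mathbf{A}_{\mathbf{k}}^{\sigma}\right)\right] = \frac{(P_{\mathbf{k}}^{\sigma})^2 + \omega_{\mathbf{k}}^2 (Q_{\mathbf{k}}^{\sigma})^2}{2} \tag{36.20}$$

that is, the Hamiltonian of a linear harmonic oscillator of unit mass.
If we do not make the product $\mathbf{A}_{\mathbf{k}}^{\sigma} \mathbf{A}_{\mathbf{k}}^{\sigma *}$ symmetrical, the energy
acquires a constant term which is immaterial for the subsequent
discourse. But the form (36.20) is the standard one for the Hamiltonian
of an oscillator. By reducing the Hamiltonian to such a form
we have justified equations (36.18) and (36.19a), which are derived
from the Hamiltonian (36.20) as quantum equations of motion.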
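The normalization of the amplitudes can be checked on a single oscillator. The lines below are a sketch of that check rather than the book's own derivation: the constant $C$ is introduced here only for this purpose, and the amplitudes are assumed to have the form $C\mathbf{e}(Q \mp iP/\omega_k)$, as suggested by (36.23); the choice of sign does not affect the result.

```latex
% Terms of (36.13) and (36.16) that survive the cancellation, for one pair (k, sigma).
% The equations of motion (36.18), (36.19a) give \dot{A}\dot{A}^{*} = \omega_k^2 A A^{*}.
\mathcal{E}_{\mathbf{k}}^{\sigma}
  = \frac{V}{8\pi}\left[\frac{2}{c^2}\,\dot{A}\dot{A}^{*} + 2k^2 A A^{*}\right]
  = \frac{V}{8\pi}\left[\frac{2\omega_k^2}{c^2} + 2k^2\right] A A^{*}
  = \frac{V k^2}{2\pi}\, A A^{*},
  \qquad \omega_k = ck .

% Symmetrized product for A = C e (Q - iP/\omega_k), A^{*} = C e (Q + iP/\omega_k):
\frac{A A^{*} + A^{*} A}{2} = C^2\left(Q^2 + \frac{P^2}{\omega_k^2}\right)

% Hence
\mathcal{E}_{\mathbf{k}}^{\sigma}
  = \frac{V C^2}{2\pi c^2}\left(P^2 + \omega_k^2 Q^2\right)
  = \frac{1}{2}\left(P^2 + \omega_k^2 Q^2\right)
  \quad\text{for}\quad C^2 = \frac{\pi c^2}{V} .
```

With the volume put equal to unity this gives the factor $(\pi c^2)^{1/2}$ that multiplies the polarization vector in the expansion of the vector potential.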


Quanta. Thus, the Hamiltonian of an electromagnetic field in 
which there are no charges is expressed as the sum of the Hamiltonians
of linear independent harmonic oscillators, each of which
corresponds to one value of the wave vector $\mathbf{k}$ and to one polarization
$\sigma$. To these oscillators can be applied all the quantization rules
obtained in Section 27. In other words, they can be described in the
representation in which the energy of each oscillator is diagonal.

The energy eigenvalue of a separate oscillator is determined from
Eqs. (27.23a) and (27.28b):

$$\mathcal{E}_{\mathbf{k}}^{\sigma} = h\omega_{\mathbf{k}} \left(N_{\mathbf{k}\sigma} + \tfrac{1}{2}\right) \tag{36.21}$$

Here, the term $h\omega_{\mathbf{k}}/2$ corresponds to the ground state of the oscillator,
the number $N_{\mathbf{k}\sigma}$ gives the number of quanta of frequency $\omega$,
wave vector $\mathbf{k}$, and polarization $\sigma$ in the field.

The momentum of an electromagnetic field can be found (Exercise 2)
similarly to the way the energy was calculated. It then turns
out that to the oscillator $\mathbf{k}, \sigma$ there corresponds the momentum
$\mathbf{p}_{\mathbf{k}}^{\sigma} = h\mathbf{k}\,(N_{\mathbf{k}\sigma} + 1/2)$, confirming the ratio between the energy and
momentum of a quantum adopted before. Thus, quanta are not an
additional hypothesis concerning the properties of an electromagnetic
field, but a corollary of the quantum laws of motion as applied
to fields. It is wrong to imagine that quanta “explain” the nature of
light. One could claim with equal justification that the formula
$E_n = h\omega\,(n + 1/2)$ explains the nature of vibrational motion in
general.

Quanta should certainly not be treated as the fruit of sundry mathematical
exercises that result in Eq. (36.21). A quantum is a real,
existing particle, the same as an electron. For example, in the scattering
of X-ray quanta on electrons, the energy $h\omega$ and momentum $h\mathbf{k}$
of each separate quantum are involved in the general law of conservation
of energy and momentum in collisions in the same way as
for any other particle. In scattering the frequency of a quantum
decreases in proportion to its energy. Like the electron, which possesses
an intrinsic degree of freedom (spin), the quantum possesses
a polarization degree of freedom. But it cannot be identified with
1/2 spin, because the quantum is described with the help of a vector
quantity (the vector potential), while 1/2 spin is described by spinors
(Sec. 30).

The latter is due to the fact that the limiting transition to the 
classical theory for the quantum and for the electron occurs in 
entirely different ways. There is nothing in the limiting transition 
to classical mechanics that corresponds to the quantum: its energy
and momentum, $h\omega$ and $h\mathbf{k}$, tend to zero when $h \to 0$. The situation
with the electron is different, and its momentum and energy pass 
from quantum to classical quantities. 

Matters are different again with the wave properties of motion. 
In the limiting transition to the classical theory, the energy of each 
quantum is considered to be infinitely small, and their number 
infinitely large, so that the wave amplitude remains finite. 

Because of their half-integral spin, electrons are subject to Pauli’s 
exclusion principle: no two electrons can be in the same state. Accordingly,
in the classical transition there is nothing that corresponds
to the wave function of the electron (and hence to all the wave properties
of its motion). They appear only in quantum theory.

It should be noted that the amplitude of an electromagnetic wave 
should not be identified with the wave function of a quantum, interpreting
it in the same sense as the probability amplitude of an electron.
The square of the wave amplitude is used to express the energy
density of a field, not the quantum density. If we wished to pass to
quantum density, the quantity would have to be divided by the
corresponding value of $\omega$ for each frequency. This would give the
expression for quantum density in the momentum representation.
But in the coordinate representation, which is obtained from the
momentum representation by means of a Fourier transformation,
quantum density cannot be expressed in terms of the square of the
wave amplitude or its derivatives, because in passing to the coordinate
representation the frequency division operator does not
yield the $\delta$ function or its derivatives.

Let us now consider the state of a field when all $N_{\mathbf{k}\sigma} = 0$. Evidently,
this is the ground state, which is also conventionally called
a vacuum (with respect to the quanta). In it all the separate oscillators
used to represent the field are in the ground state. But as was
shown in Section 28, the coordinate of a ground-state oscillator is
not zero: it does not have a strictly defined value. The probability
of a certain value of the coordinate is described by the square
of the oscillator wave function, $\psi_0^2(Q_{\mathbf{k}}^{\sigma})$. (This wave function has
no relation to quanta: it states the probability of certain field
values!)

Thus, even in the ground state (vacuum), in the absence of 
charges, an electromagnetic field does not vanish. Its amplitude 
$\mathbf{A}$ is expressed in terms of the oscillator coordinates $Q_{\mathbf{k}}^{\sigma}$. This
leads to observable effects, which will be discussed in the next 
section. 

Interaction of an Electromagnetic Field with a Charged Particle. 
We shall now examine radiation transitions, that is, processes of 
quantum emission and absorption in interactions with charged 
particles. For that we must find the operator describing the interaction.
The corresponding classical quantity for a separate charge is
obtained from (17.32) for $\varphi = 0$:

$$\mathcal{H}' = -\frac{e}{mc}\,(\mathbf{A} \cdot \mathbf{p}) \tag{36.22}$$

It corresponds to a nonrelativistic approximation with respect to 
a moving charge: its velocity is assumed to be small in comparison 
with the speed of light. 

To pass on to operators we must, as usual, replace $\mathbf{p}$ by $(h/i)\nabla$ and
substitute in place of the vector potential the operator expression
corresponding to (36.5), replacing in it the amplitudes of separate
harmonic waves by the operators (36.19b) and (36.19c). We agreed
to represent these operators in such a way as to have all the numbers
of quanta in all states diagonal. For that we must substitute instead
of the operators $Q_{\mathbf{k}}^{\sigma}$ and $P_{\mathbf{k}}^{\sigma}$ their matrix expressions (27.28a).

We substitute the operators (36.196) and (36.19c) into the vector 
potential expansion (36.5), assuming for the sake of simplicity 
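that the volume is equal to unity, since it is in any case eliminated
from the final formulas:

$$\mathbf{A} = \sum_{\mathbf{k},\,\sigma} (\pi c^2)^{1/2}\, \mathbf{e}_{\mathbf{k}}^{\sigma} \left[\left(Q_{\mathbf{k}}^{\sigma} - \frac{iP_{\mathbf{k}}^{\sigma}}{\omega_{\mathbf{k}}}\right) e^{i\mathbf{k}\mathbf{r}} + \left(Q_{\mathbf{k}}^{\sigma} + \frac{iP_{\mathbf{k}}^{\sigma}}{\omega_{\mathbf{k}}}\right) e^{-i\mathbf{k}\mathbf{r}}\right] \tag{36.23}$$

Consequently, we have to develop the matrices of $Q_{\mathbf{k}}^{\sigma} + iP_{\mathbf{k}}^{\sigma}/\omega_{\mathbf{k}}$
and $Q_{\mathbf{k}}^{\sigma} - iP_{\mathbf{k}}^{\sigma}/\omega_{\mathbf{k}}$.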

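With the help of the matrices (27.28a) we obtain

$$
Q_{\mathbf{k}}^{\sigma} + \frac{iP_{\mathbf{k}}^{\sigma}}{\omega_{\mathbf{k}}} = \left(\frac{2h}{\omega_{\mathbf{k}}}\right)^{1/2}
\begin{pmatrix}
0 & 1 & 0 & 0 & \cdots \\
0 & 0 & \sqrt{2} & 0 & \cdots \\
0 & 0 & 0 & \sqrt{3} & \cdots \\
0 & 0 & 0 & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{pmatrix},
\qquad
Q_{\mathbf{k}}^{\sigma} - \frac{iP_{\mathbf{k}}^{\sigma}}{\omega_{\mathbf{k}}} = \left(\frac{2h}{\omega_{\mathbf{k}}}\right)^{1/2}
\begin{pmatrix}
0 & 0 & 0 & 0 & \cdots \\
1 & 0 & 0 & 0 & \cdots \\
0 & \sqrt{2} & 0 & 0 & \cdots \\
0 & 0 & \sqrt{3} & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{pmatrix}
\tag{36.24}
$$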


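It is convenient to separate out the matrices themselves, which
are denoted as follows:

$$
a_{\mathbf{k}\sigma}^{+} =
\begin{pmatrix}
0 & 1 & 0 & 0 & \cdots \\
0 & 0 & \sqrt{2} & 0 & \cdots \\
0 & 0 & 0 & \sqrt{3} & \cdots \\
0 & 0 & 0 & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{pmatrix},
\qquad
a_{\mathbf{k}\sigma} =
\begin{pmatrix}
0 & 0 & 0 & 0 & \cdots \\
1 & 0 & 0 & 0 & \cdots \\
0 & \sqrt{2} & 0 & 0 & \cdots \\
0 & 0 & \sqrt{3} & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{pmatrix}
\tag{36.25}
$$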


In the matrix element defining the transition probability a row 
corresponds to the initial state of the system, a column to the final 
state (see (32.42)). Thus, the matrix elements in $a_{\mathbf{k}\sigma}^{+}$ correspond to
transitions in which the number of quanta increases by one, and $a_{\mathbf{k}\sigma}$
corresponds to transitions in which the number of quanta decreases
by one. Accordingly, $a_{\mathbf{k}\sigma}^{+}$ is termed a creation operator, and $a_{\mathbf{k}\sigma}$ an
annihilation operator.

If in some state there are $N_{\mathbf{k}\sigma}$ quanta, the matrix element of the
creation operator is proportional to $(N_{\mathbf{k}\sigma} + 1)^{1/2}$, while that of the
annihilation operator is proportional to $(N_{\mathbf{k}\sigma})^{1/2}$. The transition
probability involves the square of the matrix element, and the emission
and absorption probabilities involve $N_{\mathbf{k}\sigma} + 1$ and $N_{\mathbf{k}\sigma}$.
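The dependence of the emission and absorption weights on the number of quanta can be read off from truncated matrices. The sketch below assumes NumPy and follows the convention stated in the text that the row labels the initial state and the column the final state; the truncation to 8 levels and the variable names are choices made here for illustration.

```python
import numpy as np

def ladder_matrices(n_levels):
    """Truncated matrices indexed as [initial N, final N'] (row = initial state)."""
    creation = np.diag(np.sqrt(np.arange(1.0, n_levels)), k=1)  # N -> N + 1
    annihilation = creation.T                                  # N -> N - 1
    return creation, annihilation

creation, annihilation = ladder_matrices(8)
N = 3
emission_weight = creation[N, N + 1] ** 2        # proportional to N + 1
absorption_weight = annihilation[N, N - 1] ** 2  # proportional to N
spontaneous_weight = creation[0, 1] ** 2         # nonzero even with no quanta present
```

The last line is the numerical counterpart of spontaneous emission: the squared element for the transition from zero quanta to one quantum is unity, not zero.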

Consequently, even in the absence of quanta of a given state in the
electromagnetic field ($N_{\mathbf{k}\sigma} = 0$) they may be emitted by the radiating
system. This is known as spontaneous emission. The component
proportional to $N_{\mathbf{k}\sigma}$ in the emission probability describes stimulated
emission. The limiting transition to the classical theory corresponds
to $N_{\mathbf{k}\sigma} \to \infty$, so that the spontaneous part of quantum emission
becomes negligibly small.

Let us now determine the probability of the spontaneous emission
of a quantum by a certain system of charges. If $\mathbf{p}$ in (36.22) is interpreted
as an operator, the question may arise as to how to write it
with respect to a vector potential: according to (36.23), the latter
depends upon position and, hence, is in the most general case noncommutative
with $\mathbf{p}$. But thanks to the transverse condition, it
turns out that the order of $\mathbf{p}$ and $\mathbf{A}$ is immaterial. Indeed, since
div $\mathbf{A}$ = 0,

$$\mathbf{p}\mathbf{A}\psi = \frac{h}{i}\,(\operatorname{div} \mathbf{A})\,\psi + \mathbf{A}\mathbf{p}\psi = \mathbf{A}\mathbf{p}\psi$$

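Thus, the matrix element of the transition of a radiating system
of charges from a state with the wave function $\psi_m$, in which it emits
a quantum $h\omega_{\mathbf{k}}$ with the wave vector $\mathbf{k}$ and polarization $\sigma$, to a state
with the wave function $\psi_n$, is

$$\mathcal{H}'_{n,\,h\omega_{\mathbf{k}};\; m,\,0} = -\frac{e}{m} \left(\frac{2\pi h}{\omega_{\mathbf{k}}}\right)^{1/2} \int \psi_n^{*}\, (\mathbf{e}_{\mathbf{k}}^{\sigma} \mathbf{p})\, e^{-i\mathbf{k}\mathbf{r}}\, \psi_m\, dV \tag{36.26}$$

In (36.26) to the matrix element of $a_{\mathbf{k}\sigma}^{+}$ there corresponds the element
$(a_{\mathbf{k}\sigma}^{+})_{10}$, equal to unity. The other factors in (36.26) are found
from (36.22)-(36.24).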

The transition matrix element (36.26) should be substituted into 
the general equation (32.42), assuming the energy of the emitted 
quantum to be equal to the energy difference of the radiating system: 

$$h\omega_{\mathbf{k}} = E_m - E_n = h\omega_{mn} \tag{36.27}$$

Equation (32.42) also involves the “weight” of the final state of 
the system, that is, the number of states of the electromagnetic 
field per unit energy interval for the case of one quantum with the 
given wave vector $\mathbf{k}$ in the field. This quantity is easily found from
(28.23), which was developed for the number of oscillations of any
nature in a certain volume V (which we put equal to unity). Further,
we replace $dp_x$ by $h\,dk_x$, etc., and in addition go over to spherical
coordinates in the wave vector space. Then we obtain from Eq.
(28.23)

$$dN_{\mathbf{k}} = \frac{k^2\, dk\, d\Omega}{(2\pi)^3} \tag{36.28}$$

Here, we must also replace $k$ by $\omega/c$ and divide it by the energy
differential $h\,d\omega$. Thus, we obtain the probability of the emission,
in unit time, of a quantum of frequency $\omega = c\,|\mathbf{k}|$ and polarization
$\sigma$ in the direction of $\mathbf{k}$:


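$$\frac{dW}{dt} = \frac{2\pi}{h} \left|\mathcal{H}'_{n,\,h\omega_{\mathbf{k}};\; m,\,0}\right|^2 \frac{\omega^2\, d\Omega}{(2\pi)^3 c^3 h} = \left|\int \psi_n^{*}\, (\mathbf{e}_{\mathbf{k}}^{\sigma} \mathbf{p})\, e^{-i\mathbf{k}\mathbf{r}}\, \psi_m\, dV\right|^2 \frac{e^2 \omega\, d\Omega}{2\pi h c^3 m^2} \tag{36.29}$$

The energy emitted in unit time is obtained from (36.29) by multiplying
by $h\omega$.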
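The step from the first to the second form of (36.29) is a matter of collecting factors. As a sketch, take the modulus of the matrix element (36.26) to be $(e/m)(2\pi h/\omega)^{1/2}\,|I|$, where $I$ is a shorthand introduced here for the integral over the wave functions; this prefactor is the reading of (36.26) assumed in the check, obtained by combining the interaction $-(e/mc)(\mathbf{A}\cdot\mathbf{p})$ with the amplitude factor $(\pi c^2)^{1/2}(2h/\omega)^{1/2}$.

```latex
\frac{dW}{dt}
  = \frac{2\pi}{h}\cdot\frac{e^2}{m^2}\,\frac{2\pi h}{\omega}\,|I|^2
    \cdot\frac{\omega^2\, d\Omega}{(2\pi)^3 c^3 h}
  = \frac{e^2 \omega}{2\pi h c^3 m^2}\,|I|^2\, d\Omega ,
\qquad
I = \int \psi_n^{*}\,(\mathbf{e}_{\mathbf{k}}^{\sigma}\mathbf{p})\, e^{-i\mathbf{k}\mathbf{r}}\,\psi_m\, dV .
```

The first factor is the general transition-rate prefactor, the second is the squared matrix element, and the third is the weight of the final states from (36.28) after division by $h\,d\omega$.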


The Absorption Coefficient. Let a plane-parallel flux of quanta 
be impinging on a certain system analogous to the one capable of 
quantum emission, so that in unit time one quantum crosses a unit 
area in a unit frequency interval. We must determine the probability 
of the absorption of a quantum from that beam in unit time. 

This problem is similar to the determination of the scattering 
cross section; the difference is that in the latter case all the particles 
are assumed to have exactly the same energies, while quanta are not 
assumed in advance to be strictly monochromatic.¹⁴ They are
distributed over a certain spectral interval $\Delta\omega$ so that
$\hbar\,\Delta\omega \gg \Delta E$, where $\Delta E$ is the uncertainty in the energy of
the absorbing system due to interaction with the electromagnetic field.
In other words, the necessary relationship $E_m - E_n = \hbar\omega$ is satisfied
somewhere within the energy interval $\hbar\,\Delta\omega$. Thus stated, the problem
corresponds to real observations of absorption lines in a continuous spectrum.

¹⁴ In particle scattering the energy conservation law holds between two
continuous spectrum states, initial and final. In the absorption of a quantum,
the final state has a discrete spectrum. To satisfy the energy conservation law
the spectrum of the initial states must be assumed continuous. Because of
absorption these states have a finite lifetime and cannot have an exact energy value.

The same method may be used to compute the absorption probability
as was used to develop the emission probability formula (36.29).
In the present case, however, the transition is not from a discrete
spectrum to a continuous spectrum (emission) but the reverse. In
absorption the spectrum of initial states is continuous, and, for
example, the quanta may be assumed to be distributed over the
frequencies in such a way that in the frequency interval $d\omega$ there are
$I(\omega)\,d\omega$ quanta impinging on the absorbing system. The transition
probability should be averaged over both the spectrum of the incident
quanta and their direction and polarization. In all other respects
the computations of the transition probability are similar to the
development of Eq. (32.38). Therefore the absorption probability
differs from the emission probability in the following factors:

(i) Instead of the “weight” factor $\omega^2/(2\pi c)^3$ the absorption
probability involves the spectral density of the initial state $I(\omega)$,
where $\hbar\omega = E_m - E_n$. Furthermore, averaging over the direction
of the incident quanta yields the factor $(4\pi)^{-1}$ (that is, the result
is divided by the magnitude of the total solid angle), and averaging
over the polarization yields the factor 1/2.

(ii) Since the absorption probability is calculated for an incident
quantum flux equal to 1/(cm² s), the required expression must be
divided by $c$ (because we put $V = 1$).

(iii) In accordance with (36.25), the absorption probability
involves the factor $N_{\mathbf k\sigma}$ instead of $N_{\mathbf k\sigma} + 1$ in the emission
probability. Indices $m$ and $n$ in the matrix element are interchanged,
but by virtue of the Hermiticity of $\hat{\mathbf p}$, the square of the modulus of
the matrix element is the same.

The number of quanta, $N_{\mathbf k\sigma}$, is replaced according to the spectral
formula by $I(\omega)\,d\omega$, after which $I(\omega)$ must be put equal to unity.
Since the absorption coefficient is determined as the probability of
absorption for the spectral density $I(\omega)$ equal to unity, we finally
find that it differs from the emission coefficient in that it involves
$1/(8\pi c)$ instead of the factor $\omega^2/(8\pi^3c^3)$.

If instead of the emission probability we take the energy emitted
in unit time for $N_{\mathbf k\sigma} = 0$, its ratio to the absorption coefficient
when there is one quantum in the field ($N_{\mathbf k\sigma} = 1$) is equal to
$\hbar\omega^3/(\pi^2c^3)$. This expression was obtained by Einstein in 1916, using
statistical methods.

The Dipole Approximation. The expression (36.29) is written for 
the most general case. Let us now apply it to the emission of visible 
light by an atomic system. The wavelength of visible light is
approximately $0.5 \times 10^{-4}$ cm, and the dimensions of an atom are about
$10^{-8}$ cm, the order of the atomic unit. Hence, over a length of the
order of atomic dimensions the phase of an electromagnetic wave
remains almost unchanged. Therefore, for some point of the atom,
for example, the nucleus (for $\mathbf r = \mathbf r_0$), the factor $e^{i\mathbf{kr}_0}$ in (36.29) can
be taken outside the integral. In the integrand there remains $e^{i\mathbf k(\mathbf r - \mathbf r_0)}$.
But since the exponent is of the order of one ten-thousandth, the
exponential can be replaced by unity if the integral does not vanish
in such a substitution. The latter possibility will be discussed later.
But first we shall suppose that the integral does not vanish if we neglect
the change in wave phase within the atom. Then we find


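$$\frac{dW}{dt} = \bigl|(\mathbf e_{\mathbf k}^{\sigma}\mathbf p_{nm})\bigr|^2\,\frac{e^2\omega_{\mathbf k}\,d\Omega}{2\pi\hbar c^3m^2} \tag{36.30}$$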


But the matrix element of momentum is related to the matrix 
element of position as follows: 


$$p^x_{nm} = m\dot x_{nm} = im\omega_{nm}x_{nm}$$

(see (27.13)). If we also take into account that er is the dipole moment 
of the system, d, we arrive at the formula 


$$\frac{dW}{dt} = \bigl|(\mathbf e_{\mathbf k}^{\sigma}\mathbf d_{nm})\bigr|^2\,\frac{\omega_{nm}^3\,d\Omega}{2\pi\hbar c^3} \tag{36.31}$$


By neglecting the change in wave phase over the dimensions of 
the radiating system we in effect neglected the retardation effects. 
Therefore, the radiation probability depends on the change in dipole 
moment in the same way as it does in the classical radiation theory 
(Sec. 20). 

Suppose now that we have to find the probability of the emission
of a quantum with any polarization, not the given polarization $\sigma$,
but with a given direction $\mathbf k$. We draw a plane through vectors $\mathbf k$
and $\mathbf d_{nm}$ and assume that one polarization ($\sigma = 1$) corresponds to
this plane, and another ($\sigma = 2$) corresponds to a plane perpendicular
to it. Then $\mathbf e_{\mathbf k}^{(2)}\mathbf d_{nm} = 0$, and $\mathbf e_{\mathbf k}^{(1)}\mathbf d_{nm} = d_{nm}\sin\theta$, where $\theta$ is the
angle between $\mathbf k$ and $\mathbf d_{nm}$.

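To find the total energy radiated in unit time we must integrate
the obtained expression over the solid angle $\Omega$. Then after
multiplying by $\hbar\omega$ we get

$$\frac{dE}{dt} = \frac{|\mathbf d_{nm}|^2\,\omega_{nm}^4}{c^3}\int\frac{\sin^2\theta\,d\Omega}{2\pi} = \frac{4}{3}\,\frac{|\mathbf d_{nm}|^2\,\omega_{nm}^4}{c^3} \tag{36.32}$$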
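The angular factor 4/3 in (36.32) can be checked numerically. The sketch below is only an illustration, not part of the original derivation: it integrates $\sin^2\theta$ over the full solid angle with a simple midpoint rule (the function name and the number of steps are arbitrary choices made here) and divides by $2\pi$.

```python
import math

def sin2_solid_angle_integral(n=20000):
    """Midpoint-rule estimate of the integral of sin^2(theta) over the full solid angle."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        # dOmega = 2*pi*sin(theta) dtheta once the azimuth has been integrated out
        total += math.sin(theta) ** 2 * 2.0 * math.pi * math.sin(theta) * h
    return total

integral = sin2_solid_angle_integral()
angular_factor = integral / (2.0 * math.pi)  # multiplies |d_nm|^2 w^4 / c^3 in (36.32)
print(integral, 8.0 * math.pi / 3.0)
print(angular_factor)
```

The integral comes out close to $8\pi/3$, so the factor multiplying $|\mathbf d_{nm}|^2\omega_{nm}^4/c^3$ is close to 4/3.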


In comparison with the classical formula (20.28), the quantum 
equation involves an extra 2 (the factor 4/3 instead of 2/3). This is 
explained as follows. We represent the classical dipole moment 
varying according to the harmonic law as 


$$\mathbf d = \mathbf d_1e^{i\omega t} + \mathbf d_1^*e^{-i\omega t}, \qquad \ddot{\mathbf d} = -\omega^2\left(\mathbf d_1e^{i\omega t} + \mathbf d_1^*e^{-i\omega t}\right)$$

The terms $\mathbf d_1e^{i\omega t}$ and $\mathbf d_1^*e^{-i\omega t}$ depend upon time in the same way as the
matrix elements $\mathbf d_{mn}$ and $\mathbf d_{nm}$. We form the time average of $(\ddot{\mathbf d})^2$.
The terms involving $e^{2i\omega t}$ and $e^{-2i\omega t}$ are eliminated in the averaging,
leaving

$$\overline{(\ddot{\mathbf d})^2} = \omega^4\,2\,\mathbf d_1\mathbf d_1^* = 2\omega^4|\mathbf d_1|^2$$

The quantum formula corresponds to the mean radiation intensity
$\hbar\omega\,(dW/dt)$, so that the 2 is associated with the time averaging of
the square of the dipole moment.
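The comparison can be written out in one line. The block below is a sketch which assumes that the classical formula (20.28) has the usual form $dE/dt = 2\ddot{\mathbf d}^2/(3c^3)$ (consistent with the factor 2/3 quoted above) and that the classical amplitude $\mathbf d_1$ is identified with the matrix element $\mathbf d_{nm}$:

```latex
\overline{\frac{dE}{dt}}
  = \frac{2}{3c^3}\,\overline{(\ddot{\mathbf d})^2}
  = \frac{2}{3c^3}\cdot 2\omega^4|\mathbf d_1|^2
  = \frac{4}{3}\,\frac{\omega^4|\mathbf d_1|^2}{c^3}
```

Under these assumptions the time-averaged classical intensity has the same form as (36.32).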

Equation (36.32) confirms what was said in Section 21 about the 
stability of the atom following from quantum theory: radiation is 
always associated with an atom’s transition from one state to 
another, but no states exist in which the energy would be less than the 
ground state, hence an atom may remain in the ground state in¬ 
definitely. 

Selection Rules. Let us now determine the conditions at which 
the dipole moment matrix element does not vanish in a given transi¬ 
tion. These conditions, imposed on the change of state of an atom in 
radiation, are known as selection rules. 

We start with the simplest case of a one-electron transition in an 
atom. We furthermore assume that the atom is light and we need 
not take spin-orbit coupling into account. Then the spin function 
separates out of the total wave function as a factor. But since the 
matrix element is taken only with respect to the coordinate, the 
condition for the spin is that its projection does not change in the 
transition. 

The principal quantum number imposes no restrictions: any 
variation is possible without the matrix element vanishing. There 
remain the orbital and magnetic quantum numbers. 

In Exercise 4, Section 34, it was pointed out that the operator y 
satisfies the same commutation relations with the angular momen¬ 
tum operator as those which govern the position operator in a com¬ 
mutation with the particle’s orbital angular momentum operator. 
But the condition that the matrix element (y) xyz does not vanish 
was developed solely from the commutation relations. It follows
that the coordinate has the same nonzero matrix elements with 
respect to the angular momentum eigenvalues and projection as y. 

This leads to the following selection rules for the dipole moment 
components: 

(i) The electron’s orbital quantum number, $l$, can only vary in
dipole radiation by ±1; if it was $l$ in the initial state it can be
only $l + 1$ or $l - 1$ in the final state. In other words, $\Delta l = \pm1$.

(ii) If the $z$ axis is taken as the polar axis (the angular momentum
quantization axis), then only the matrix elements $e(z)_{kk}$, which
are diagonal with respect to the magnetic quantum number, are
other than zero. For them $\Delta k = 0$.

The matrix elements $e(x \pm iy)_{k'k}$ are not zero when
$k' - k = \Delta k = \pm1$, according to the sign in $x \pm iy$. This rule means the
following. If the radiation is polarized along the $z$ axis, the electric
field vector is determined by the $z$th component of the dipole
moment. Hence, the selection rule $\Delta k = 0$ refers to such radiation. If
the radiation is circularly polarized in the $x, y$-plane, the rule for
its circular components, the electric field of which is proportional to
$e(x \pm iy)$, is $\Delta k = \pm1$. For all three types of polarization (plane,
and circular in both directions) the selection rule $\Delta l = \pm1$ holds.
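The rule $\Delta l = \pm1$ can be illustrated numerically in the simplest case. The sketch below makes assumptions beyond the text: it keeps only states with zero magnetic quantum number, whose angular dependence is the Legendre polynomial $P_l(\cos\theta)$, ignores the radial integral and normalization, and checks for which pairs of $l$ the angular integral of $z \propto \cos\theta$ is nonzero. The function names and the tolerance are illustrative choices.

```python
def legendre(l, x):
    """Legendre polynomial P_l(x) from the standard three-term recurrence."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def z_angular_integral(l_final, l_initial, n=20000):
    """Midpoint estimate of the integral of P_l'(x) * x * P_l(x) over x = cos(theta) in [-1, 1]."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        total += legendre(l_final, x) * x * legendre(l_initial, x) * h
    return total

# Pairs (l_final, l_initial) for which the z-component integral does not vanish
allowed = [(lf, li) for lf in range(4) for li in range(4)
           if abs(z_angular_integral(lf, li)) > 1e-6]
print(allowed)
```

Only pairs with $|l' - l| = 1$ survive, which is the rule (i) for the component of the dipole moment along the polar axis.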

Let us now explain the physical meaning of the selection rules. 
Quantization of the electromagnetic field was achieved by a plane-wave
expansion, when each separate harmonic wave was characterized
by a definite momentum and polarization. If an expansion in
spherical waves is effected in a spherical cavity with mirror walls, 
we find that each wave is characterized by angular momentum and 
parity in accordance with the general laws of isotropy of space. 
Here, the angular momentum is given by Eq. (15.28). 

Spherical waves are represented quite independently of quantum
theory with the help of the $Y_L^K$ functions, where $L$ and $K$ are integers
such that $-L \le K \le L$. This is the system of orthogonal functions
on the surface of a unit sphere corresponding to rotation symmetry.
The $z$th projection of the angular momentum is found to be

$$M_z = \frac{E}{\omega}\,K \tag{36.33}$$

If we then quantize, for a separate quantum (photon) we obtain
$E = \hbar\omega$, so that the projection of the angular momentum of a
photon, as of any particle, assumes only integral values. In addition to
the angular momentum projection and energy, quantum mechanics
deals with the angular momentum square $M^2$, the eigenvalues of which
are $\hbar^2L(L + 1)$, and $|K| \le L$.

Electromagnetic waves are transverse and cannot have spherical 
symmetry. Accordingly, the quantum number $L$ is never zero: only
the function $Y_0^0$ is spherically symmetrical. The least possible value
is L = 1. The symmetry and spatial configuration of the field 
correspond to electric or magnetic dipole radiation. For large values 
of L there are two multipole types for each L. 

Electric and magnetic radiation differ in parity. Electrodipole 
radiation has parity —1, magnetodipole, +1, electroquadrupole, 
+1, magnetoquadrupole, —1, etc., alternately. This rule is easily 
explained as applied to dipole radiation. The electric dipole moment 
is a vector, the magnetic dipole moment is a pseudovector (see 
Sec. 15). In an inversion, vector components change their signs, 
while pseudovector components do not. This affects their parities. 
Each successive multipole order reverses the parity, insofar as it 
corresponds to the appearance of an additional factor r in the multi¬ 
pole moment. 

In the case of a quantum, the angular momentum cannot be sepa¬ 
rated into orbital and spin angular momenta: this is possible only 
in nonrelativistic theory, while a quantum is a relativistic particle 
and its velocity is equal to $c$. But since its angular momentum cannot
be less than unity, we may say that its intrinsic angular momentum
is equal to unity. However, the analogy with the angular momentum
of a particle is not complete. For the case of $l = 1$, a particle has
three states with the angular momentum projections $k$: −1, 0, and 1.
A quantum with unity angular momentum is a dipole and can
occur in only two states, electrodipole and magnetodipole.

Note the following. In a plane-wave expansion the state of a
quantum is defined by four numbers, $k_x$, $k_y$, $k_z$, and $\sigma$. In a
spherical-wave expansion we also obtain four numbers: $\omega$, $L$, $K$, and parity.
Like $\sigma$, parity takes on two values.

It is now not difficult to interpret the selection rules as a conse¬ 
quence of the angular momentum and parity conservation laws. 
The established rules refer to electrodipole radiation. Consequently, 
the angular momentum of a radiating electron cannot change by more
than unity ($|\Delta l| \le 1$). The parity must change, since the
electrodipole quantum is odd. Hence, there can never be $\Delta l = 0$.

These selection rules are easily generalized for the case of
sufficiently strong spin-orbit coupling and the electron state being
characterized not separately by $l$ and $\sigma_z$ but by the total angular
momentum $j$. Since angular momentum is not directly associated
with parity, $\Delta j = 0$ is not prohibited in electric dipole radiation.
Only a $(j = 0) \to (j = 0)$ transition is strictly prohibited, because
it is incompatible with the angular momentum conservation law.
The angular momentum of a quantum cannot be less than unity,
therefore the $0 \to 0$ transition is prohibited in all approximations.

Let us consider many-electron atoms. If the spin-orbit coupling 
is not large the atom’s spin function separates out as a factor of the 
spatial function. For example, in helium the total spin of the two 
electrons may be equal to unity, the ortho-state (Sec. 33), or to 
zero, the para-state. A transition from one to the other is prohibited 
because the wave functions of the ortho- and para-states are orthogo¬ 
nal, and the electric dipole moment does not depend on spin operators. 
Therefore, the dipole radiation spectrum of helium as it were sepa¬ 
rates into two spectra: one belonging wholly to orthohelium, the 
other to parahelium. This was known before quantum mechanics was 
developed, and it was explained by Heisenberg on the basis of wave 
function symmetry. 

The selection rules for the total angular momentum $J$ of an atom
are similar to the selection rules for the angular momentum of a
separate electron, $j$. The same is true of the selection rules for the
$z$th projection of the total angular momentum.

Transitions prohibited for electrodipole quanta may be accompa¬ 
nied by emission of magnetodipole or higher multipole quanta. 
Corresponding to this in optical spectra is the lower line intensity 
according to the expansion of the wave amplitude in powers of the 
small quantities $v/c$ and $r/\lambda$ (see Sec. 20). For that reason excited
states of atoms or nuclei capable only of high multipole-order
radiation transitions have long lifetimes and small energy level widths.

The Zeeman Multiplet. Let us apply the obtained selection rules 
to the Zeeman effect, that is, examine the spectrum of an atom in 
a magnetic field. We start with the case of a strong field, in which the 
splitting of spectral lines provides a simpler picture than in a weak 
field. 

In accordance with Eq. (33.52), both multiplet levels between
which the transitions take place are split. Assigning the spin and orbital
angular momentum projections involved in this equation all
permitted values, we see that all the splitting components are
equidistant and multiples of the integers $L_z + 2S_z$, and because some values
of $L_z + 2S_z$ are obtained in several ways, the levels corresponding
to them are degenerate.

For example, if $L = 1$, $S = 1/2$, the following set of values of
the sum obtains: $1 + 1 = 2$, $1 - 1 = 0$, $0 + 1 = 1$, $0 - 1 = -1$,
$-1 + 1 = 0$, $-1 - 1 = -2$. Splitting occurs into a total of
five levels, of which the zero level is two-fold degenerate, that is,
six states occur. In a weak field, $L$ and $S$ first combine into $J$, which
assumes two values: $J = 3/2$ and $J = 1/2$. Their multiplicities
are 4 and 2, a total of 6 states, as in a strong field.
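The counting in this example is easy to reproduce. The following sketch (the helper name `strong_field_levels` is an illustrative choice, not notation from the text) enumerates all projections for $L = 1$, $S = 1/2$, groups the states by the value of $L_z + 2S_z$, and compares the total with the weak-field count $\sum(2J + 1)$.

```python
from collections import Counter
from fractions import Fraction

def strong_field_levels(L, S):
    """Number of states for each value of L_z + 2*S_z."""
    L, S = Fraction(L), Fraction(S)
    counts = Counter()
    lz = -L
    while lz <= L:
        sz = -S
        while sz <= S:
            counts[lz + 2 * sz] += 1
            sz += 1
        lz += 1
    return dict(sorted(counts.items()))

levels = strong_field_levels(1, Fraction(1, 2))
strong_total = sum(levels.values())

# Weak field: J = 3/2 and J = 1/2, each with 2J + 1 states
weak_total = sum(2 * J + 1 for J in (Fraction(3, 2), Fraction(1, 2)))

print(levels)
print(strong_total, weak_total)
```

Five distinct values of $L_z + 2S_z$ appear, the value 0 occurs twice, and both countings give six states.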

Suppose radiation is observed perpendicular to a magnetic field. 
The polarization vector of the radiation is perpendicular to the 
direction of propagation, that is, it is directed either along the 
magnetic field or in the third perpendicular direction, say along 
the x axis, if the magnetic field is parallel to the z axis. The selection 
rules differ for radiation polarized along z and along x. If spin-orbit 
coupling can be neglected, then S z is separately conserved in any 
polarization. But in that case, for radiation polarized along z both S z 
and L z are conserved. Consequently, when a transition occurs be¬ 
tween states split according to Eq. (33.52), these numbers cancel out, 
yielding one unsplit line. 

A wave polarized along the x axis can be represented as a super¬ 
position of two circularly polarized waves of opposite sense. The 
selection rule for the corresponding lines is that L z can vary only 
by ±1. Consequently, radiation polarized along the x axis has 
a frequency differing from the initial one by $\pm e|\mathbf H|/(2mc)$. Thus,
irrespective of the total number of splitting components of every
level in a magnetic field, a spectral line in a strong magnetic field
splits into three lines the separation of which is equal to $e|\mathbf H|/(2mc)$,
that is, the Larmor frequency for the given field.

If we drill a hole in the armature of an electromagnet, it is
possible to observe radiation along the field. It is circularly polarized
in the $x, y$-plane. The selection rules for clockwise and
counterclockwise polarization correspond to a variation of $L_z$ by ±1, so
that two lines are observed at a distance $e|\mathbf H|/(2mc)$ from the
middle. When the magnetic field is switched on, the initial line
splits into two lines, the separation between which is equal to the
double Larmor frequency. The splitting pattern is exactly the
same in the classical vibrational motion of a charge placed in a
magnetic field. This problem was examined in Exercise 4,
Section 20.

The splitting of spectral lines in a magnetic field was discovered 
by Zeeman before quantum mechanics appeared, which is why at 
the time the theoretical explanation of the effect corresponded to 
the classical problem, where it is assumed that the charge is in 
vibrational motion. 

But such a spectral pattern is observed only in a strong magnetic 
field, which produces greater splitting than the separation between 
multiplet levels. In these conditions the Zeeman effect is said to be
normal, because superficially it corresponds to the theoretical
notions of the time when it was discovered. Note that a field that is
strong for one multiplet may still be weak for another.

In a weak magnetic field the Zeeman effect is said to be anomalous.
The spectral pattern is quite unlike the classical. First of all, 
the number of splitting components may differ from the normal. 
Their separations may also be quite different. 

Let us consider, as an example, the anomalous Zeeman effect
for the so-called D line of sodium vapour. In the absence of an
external magnetic field the line is double. It corresponds to two
transitions: $^2P_{1/2} \to {}^2S_{1/2}$ and $^2P_{3/2} \to {}^2S_{1/2}$. The $^2P$ level has orbital
angular momentum unity and half-integral spin. Hence the total angular
momentum $J$ may assume two values: $J = 1 + 1/2 = 3/2$, and
$J = 1 - 1/2 = 1/2$. This yields the doublet structure of the $^2P$
level in the absence of an external field. The $^2S_{1/2}$ level cannot split,
since its orbital angular momentum is zero. A double D line
appears in the sodium spectrum in a transition from a doublet to a
singlet level. The separation for its frequency components is
approximately one-thousandth of the mean frequency of the doublet.
The $^2P_{3/2}$ level is higher than the $^2P_{1/2}$ level.

Let us calculate the Landé factors for the three levels. From (33.51)
we have:

(1) $^2P_{3/2}$: $J = 3/2$, $L = 1$, $S = 1/2$

$$g = 1 + \frac{1}{2}\,\frac{3/2 \times 5/2 + 1/2 \times 3/2 - 1 \times 2}{3/2 \times 5/2} = \frac{4}{3}$$

(2) $^2P_{1/2}$: $J = 1/2$, $L = 1$, $S = 1/2$

$$g = 1 + \frac{1}{2}\,\frac{1/2 \times 3/2 + 1/2 \times 3/2 - 1 \times 2}{1/2 \times 3/2} = \frac{2}{3}$$

(3) $^2S_{1/2}$: $J = 1/2$, $L = 0$, $S = 1/2$

$$g = 1 + \frac{1}{2}\,\frac{1/2 \times 3/2 + 1/2 \times 3/2}{1/2 \times 3/2} = 2$$
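The three Landé factors can be recomputed with exact fractions. The sketch below assumes that (33.51) has the standard form $g = 1 + [J(J+1) + S(S+1) - L(L+1)]/[2J(J+1)]$, which is the form the arithmetic above follows; the function name is an illustrative choice.

```python
from fractions import Fraction

def lande_factor(J, L, S):
    """g = 1 + [J(J+1) + S(S+1) - L(L+1)] / [2 J(J+1)] (assumed form of (33.51))."""
    J, L, S = Fraction(J), Fraction(L), Fraction(S)
    return 1 + (J * (J + 1) + S * (S + 1) - L * (L + 1)) / (2 * J * (J + 1))

half = Fraction(1, 2)
g_P32 = lande_factor(3 * half, 1, half)  # level 2P3/2
g_P12 = lande_factor(half, 1, half)      # level 2P1/2
g_S12 = lande_factor(half, 0, half)      # level 2S1/2
print(g_P32, g_P12, g_S12)
```

The values 4/3, 2/3 and 2 are reproduced for the levels $^2P_{3/2}$, $^2P_{1/2}$ and $^2S_{1/2}$.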

We shall measure the splitting energy E' in units of P|H| so that 

E ' = gJ z (36.34) 

where $g$ is the corresponding Landé factor. We then obtain the following
splitting pattern for the levels $^2P_{3/2}$, $^2P_{1/2}$, and $^2S_{1/2}$:

(1) $E'(-3/2) = -2$, $E'(-1/2) = -2/3$, $E'(1/2) = 2/3$, $E'(3/2) = 2$

(2) $E'(-1/2) = -1/3$, $E'(1/2) = 1/3$

(3) $E'(-1/2) = -1$, $E'(1/2) = 1$

Let us find what the spectrum looks like. We start with the
transition $^2P_{1/2} \to {}^2S_{1/2}$. Vibrations polarized along the field are subject
to the selection rule $\Delta J_z = 0$. Hence, their frequencies are shifted
with respect to the median position by

$$E'\left(^2P_{1/2}, -\tfrac12\right) - E'\left(^2S_{1/2}, -\tfrac12\right) = -\tfrac13 + 1 = \tfrac23$$

$$E'\left(^2P_{1/2}, \tfrac12\right) - E'\left(^2S_{1/2}, \tfrac12\right) = \tfrac13 - 1 = -\tfrac23$$

Unlike the normal Zeeman effect, we have also obtained a double
line for radiation polarized along the field.

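For polarizations perpendicular to the field we have

$$E'\left(^2P_{1/2}, -\tfrac12\right) - E'\left(^2S_{1/2}, \tfrac12\right) = -\tfrac13 - 1 = -\tfrac43 \quad \text{(clockwise polarization)}$$

$$E'\left(^2P_{1/2}, \tfrac12\right) - E'\left(^2S_{1/2}, -\tfrac12\right) = \tfrac13 + 1 = \tfrac43 \quad \text{(counterclockwise polarization)}$$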


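Referring now to the transition $^2P_{3/2} \to {}^2S_{1/2}$, if the vibration is
polarized along the field, we have, of course, two lines again, but
with a different separation from the median position:

$$E'\left(^2P_{3/2}, -\tfrac12\right) - E'\left(^2S_{1/2}, -\tfrac12\right) = -\tfrac23 + 1 = \tfrac13$$

$$E'\left(^2P_{3/2}, \tfrac12\right) - E'\left(^2S_{1/2}, \tfrac12\right) = \tfrac23 - 1 = -\tfrac13$$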







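For clockwise polarization we obtain

$$E'\left(^2P_{3/2}, -\tfrac32\right) - E'\left(^2S_{1/2}, -\tfrac12\right) = -2 + 1 = -1$$

$$E'\left(^2P_{3/2}, -\tfrac12\right) - E'\left(^2S_{1/2}, \tfrac12\right) = -\tfrac23 - 1 = -\tfrac53$$

Correspondingly, for counterclockwise polarization the splitting
is characterized by the numbers 1 and 5/3.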



Thus, one component of the D line splits into six Zeeman com¬ 
ponents, and the other into four. The Zeeman effect remains anom¬ 
alous as long as the magnetic field is weaker than 5000 statoersted. 
The splitting pattern is presented in Figure 41. 
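The count of components can be reproduced from the Landé factors 4/3, 2/3 and 2 found above and the rule $E' = gJ_z$. The sketch below is an illustration only: the helper names are invented here, and the sign convention (a change of $J_z$ by −1 listed as clockwise, by +1 as counterclockwise) simply follows the worked values in the text.

```python
from fractions import Fraction

half = Fraction(1, 2)

def sublevels(g, J):
    """Map each projection J_z to the splitting energy E' = g * J_z."""
    out = {}
    jz = -J
    while jz <= J:
        out[jz] = g * jz
        jz += 1
    return out

def line_shifts(upper, lower):
    """Shifts E'(upper) - E'(lower), grouped by the change of J_z (0, +1 or -1)."""
    shifts = {0: [], 1: [], -1: []}
    for jz_u, e_u in upper.items():
        for jz_l, e_l in lower.items():
            d = jz_u - jz_l
            if d in shifts:
                shifts[d].append(e_u - e_l)
    return {d: sorted(v) for d, v in shifts.items()}

S12 = sublevels(Fraction(2), half)
P12 = sublevels(Fraction(2, 3), half)
P32 = sublevels(Fraction(4, 3), 3 * half)

d1 = line_shifts(P12, S12)  # 2P1/2 -> 2S1/2
d2 = line_shifts(P32, S12)  # 2P3/2 -> 2S1/2
n1 = sum(len(v) for v in d1.values())
n2 = sum(len(v) for v in d2.values())
print(d1, n1)
print(d2, n2)
```

The transition from $^2P_{1/2}$ gives four shifted lines (±2/3, ±4/3) and the transition from $^2P_{3/2}$ gives six (±1/3, ±1, ±5/3), in agreement with the statement above.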


EXERCISES 

1. Develop the splitting pattern for the multiplets $^3P$ and $^3S$ and the
transitions between them in a weak and a strong magnetic field.

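2. Calculate the linear momentum of an electromagnetic field in vacuum
with the help of the normal field coordinates $Q_{\mathbf k}^{\sigma}$.

Solution. Substitute the electric and magnetic fields according to (36.6)
and (36.7) into the expression (15.25):

$$\mathbf P = \frac{1}{4\pi c}\int(\mathbf E \times \mathbf H)\,dV$$

After integrating over the volume we obtain, making use of (36.11),

$$\mathbf P = \frac{1}{4\pi c^2}\sum_{\mathbf k,\sigma}\omega_{\mathbf k}\Bigl\{[\mathbf A_{\mathbf k}^{\sigma} \times (\mathbf k \times \mathbf A_{-\mathbf k}^{\sigma})] + [\mathbf A_{\mathbf k}^{\sigma} \times (\mathbf k \times \mathbf A_{\mathbf k}^{\sigma*})] + [\mathbf A_{\mathbf k}^{\sigma*} \times (\mathbf k \times \mathbf A_{\mathbf k}^{\sigma})] + [\mathbf A_{\mathbf k}^{\sigma*} \times (\mathbf k \times \mathbf A_{-\mathbf k}^{\sigma*})]\Bigr\}$$

We transform the double vector products and get

$$\mathbf P = \frac{1}{4\pi c^2}\sum_{\mathbf k,\sigma}\omega_{\mathbf k}\left[\mathbf k\,(\mathbf A_{\mathbf k}^{\sigma} \cdot \mathbf A_{-\mathbf k}^{\sigma}) + \mathbf k\,(\mathbf A_{\mathbf k}^{\sigma*} \cdot \mathbf A_{-\mathbf k}^{\sigma*}) + 2\mathbf k\,(\mathbf A_{\mathbf k}^{\sigma*} \cdot \mathbf A_{\mathbf k}^{\sigma})\right]$$

The quantities $\mathbf k\,(\mathbf A_{\mathbf k}^{\sigma} \cdot \mathbf A_{-\mathbf k}^{\sigma})$ and $\mathbf k\,(\mathbf A_{\mathbf k}^{\sigma*} \cdot \mathbf A_{-\mathbf k}^{\sigma*})$ are odd functions of $\mathbf k$,
and after summation over all $\mathbf k$ they vanish, leaving only

$$\mathbf P = \frac{1}{2\pi c^2}\sum_{\mathbf k,\sigma}\omega_{\mathbf k}\,\mathbf k\,(\mathbf A_{\mathbf k}^{\sigma*} \cdot \mathbf A_{\mathbf k}^{\sigma})$$

Substitution of the normal field coordinates from (36.17a) and (36.17b) yields

$$\mathbf P = \sum_{\mathbf k,\sigma}\frac{\mathbf k}{2\omega_{\mathbf k}}\left[(P_{\mathbf k}^{\sigma})^2 + \omega_{\mathbf k}^2(Q_{\mathbf k}^{\sigma})^2\right] = \sum_{\mathbf k,\sigma}\hbar\mathbf k\left(N_{\mathbf k\sigma} + \tfrac12\right)$$

so that (14.13) gives the relationship between the momentum and energy
of a quantum.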

3. Obtain the commutation relations for the operators a k(J and a£ a? 
and determine a kg a kg and fl kg fl kg . 

Answer. a kg a kg and a kg a£ a are diagonal; aj; g a kg = ^ka + * an( * 
fl ka a kCT = ^ka* F rom this we have the commutation relation 

a ka fl ka a ka a ka = ^ 

For different k, a and k', o' the operators a£ a and a k , a , commute. The ope¬ 
rators a k(J and a k , g , apparently always commute, the same as a£ g and a£, a ,. 


37 


THE DIRAC EQUATION 

Spinors in Four Dimensions. In the preceding section we developed
the quantum theory of a relativistic particle, the light quantum.
We were not concerned with the relativistic invariance of the
equations, because it was inherent in the classical system, in Maxwell’s
equations. There is no such initial wave equation for the electron,
because there is nothing in the limiting classical case that
corresponds to its wave function (Sec. 36). Here we must from the outset
require that the equations be relativistically invariant.

In any case, it is obvious that the electron wave equation, which 
is invariant with respect to the Lorentz transformations, must of 
necessity take account of spin, because spin-orbit interaction is 
a relativistic effect. As was shown in Section 30, the wave function 
of a particle with half-integral spin is a two-component spinor quan¬ 
tity, which in rotations of the coordinate axes transforms through 
half-angles. Since the Lorentz transformation can be likened to a 
rotation of the coordinate system (Sec. 13), the definition of a spinor 
should be extended to rotations in four-dimensional space, by 
analogy with the extension of the vector definition to four-vectors. 

In the most general case the transformation of a spinor in the
rotations of a coordinate system takes place according to the law

$$\xi_1' = \alpha\xi_1 + \beta\xi_2, \qquad \xi_2' = \gamma\xi_1 + \delta\xi_2 \tag{37.1}$$

Then, as is readily apparent, the following combination of the
components of two different spinors, $\xi$ and $\eta$,

$$\xi_1'\eta_2' - \xi_2'\eta_1' = \xi_1\eta_2 - \xi_2\eta_1 \tag{37.2}$$

remains invariant if

$$\alpha\delta - \beta\gamma = 1 \tag{37.3}$$

If $\xi_1$ and $\xi_2$ are the components of a spinor wave function, then
$|\xi_1|^2 + |\xi_2|^2$ is the probability density of the particles occurring
at the given point in space. In three dimensions this is a scalar
quantity, which imposes certain restrictions on the transformation
coefficients, namely $\alpha^* = \delta$ and $\beta^* = -\gamma$.
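Both statements can be checked numerically. The sketch below is an illustration with arbitrarily chosen numbers and function names that are not from the text: the first set of coefficients satisfies only (37.3), the second also satisfies $\alpha^* = \delta$, $\beta^* = -\gamma$.

```python
import math

def transform(spinor, a, b, c, d):
    """Apply the law (37.1): xi1' = a*xi1 + b*xi2, xi2' = c*xi1 + d*xi2."""
    x1, x2 = spinor
    return (a * x1 + b * x2, c * x1 + d * x2)

def antisymmetric_product(xi, eta):
    """The combination xi1*eta2 - xi2*eta1 from (37.2)."""
    return xi[0] * eta[1] - xi[1] * eta[0]

def density(xi):
    """|xi1|^2 + |xi2|^2."""
    return abs(xi[0]) ** 2 + abs(xi[1]) ** 2

xi = (1.0 + 2.0j, -0.5 + 0.3j)
eta = (0.2 - 1.0j, 1.5 + 0.7j)

# General case: alpha, beta, gamma arbitrary, delta fixed by alpha*delta - beta*gamma = 1
al, be, ga = 0.7 + 0.2j, -0.3 + 1.1j, 0.5 - 0.4j
de = (1 + be * ga) / al
general = (al, be, ga, de)
scalar_kept = abs(
    antisymmetric_product(transform(xi, *general), transform(eta, *general))
    - antisymmetric_product(xi, eta)
) < 1e-12
density_changed = abs(density(transform(xi, *general)) - density(xi)) > 1e-6

# Three-dimensional rotation: alpha* = delta, beta* = -gamma, |alpha|^2 + |beta|^2 = 1
t = 0.6
al3 = math.cos(t) * complex(math.cos(0.4), math.sin(0.4))
be3 = math.sin(t) * complex(math.cos(1.3), math.sin(1.3))
rotation = (al3, be3, -be3.conjugate(), al3.conjugate())
density_kept = abs(density(transform(xi, *rotation)) - density(xi)) < 1e-12

print(scalar_kept, density_changed, density_kept)
```

The antisymmetric combination survives any transformation with unit determinant, while $|\xi_1|^2 + |\xi_2|^2$ is preserved only when the additional restrictions hold.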

In four dimensions the probability density should be considered
as the fourth component of a vector, therefore, if in the ordinary
rotations of the coordinate axes we also take into account the Lorentz
rotations, then the transformations of a certain spinor $(\xi_1, \xi_2)$ and
the transformations of the complex conjugate spinor are not linked
by the same relationships as in three-dimensional space. Accordingly,
in future we shall denote such spinors by an asterisk (another
convention is to place a dot over the spinor index).

Suppose a pure Lorentz rotation is performed through an imaginary
angle $\omega$ the tangent of which is equal to $iV/c$. Then, as can be seen
from (30.42), $\xi_1$ and $\xi_2$ are replaced by $\xi_1e^{i\omega/2}$ and $\xi_2e^{-i\omega/2}$. But since
$\omega$ is an imaginary angle, the rotation equations for the spinor
components take the form

$$\xi_1' = \xi_1e^{-|\omega|/2}, \qquad \xi_2' = \xi_2e^{|\omega|/2} \tag{37.4}$$

So as to arrive sooner at the standard notation of Dirac’s theory
we shall assume that the relative velocity $V$ of the reference frames
is directed along the $z$ axis. Then the Lorentz transformations (13.17)
and (13.18) can be expressed as

$$z' = \frac{z - Vt}{(1 - V^2/c^2)^{1/2}}, \qquad t' = \frac{t - Vz/c^2}{(1 - V^2/c^2)^{1/2}}$$

We form linear combinations

$$ct' \pm z' = (ct \pm z)\,\frac{1 \pm V/c}{(1 - V^2/c^2)^{1/2}}$$

Substituting $V/c = -i\tan\omega$ into this, we find

$$ct' \pm z' = (ct \pm z)\,e^{\mp i\omega} = (ct \pm z)\,e^{\pm|\omega|} \tag{37.5}$$

since $\omega$ is a purely imaginary quantity.

Since in the present case $\xi^*$ transforms exactly like $\xi$ (because
$e^{-|\omega|/2}$ is real), we arrive at the following correspondence between the
transformations of the products of the spinor components multiplied
by their complex conjugates, and the vector components:

$$\xi_1'^*\xi_1' = \xi_1^*\xi_1\,e^{-|\omega|}, \qquad ct' + z' = (ct + z)\,e^{|\omega|}$$

$$\xi_2'^*\xi_2' = \xi_2^*\xi_2\,e^{|\omega|}, \qquad ct' - z' = (ct - z)\,e^{-|\omega|}$$

This Lorentz transformation does not affect the $x$ and $y$ coordinates
and does not alter the products $\xi_2^*\xi_1$ and $\xi_1^*\xi_2$. To establish the
correspondence between them, consider a spatial rotation about the $z$
axis:

$$x' = x\cos\varphi + y\sin\varphi, \qquad y' = -x\sin\varphi + y\cos\varphi$$

Multiply the second equation by $\pm i$ and add with the first to get

$$x' \pm iy' = (x \pm iy)\,e^{\mp i\varphi} \tag{37.6}$$

From this we arrive at the correspondence

$$\xi_2'^*\xi_1' = \xi_2^*\xi_1\,e^{i\varphi}, \qquad x' + iy' = (x + iy)\,e^{-i\varphi}$$

$$\xi_1'^*\xi_2' = \xi_1^*\xi_2\,e^{-i\varphi}, \qquad x' - iy' = (x - iy)\,e^{i\varphi}$$
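The rotation part of this correspondence can be verified with a few lines of arithmetic. The sketch below uses arbitrary numbers and assumes the half-angle law $\xi_1' = \xi_1e^{i\varphi/2}$, $\xi_2' = \xi_2e^{-i\varphi/2}$ for a rotation about the $z$ axis (the real-angle counterpart of the replacement quoted from (30.42)).

```python
import cmath
import math

phi = 0.37        # rotation angle about the z axis (arbitrary)
x, y = 1.3, -0.8  # arbitrary vector components

x_rot = x * math.cos(phi) + y * math.sin(phi)
y_rot = -x * math.sin(phi) + y * math.cos(phi)

# Check (37.6): x' +- i y' = (x +- i y) exp(-+ i phi)
plus_ok = abs((x_rot + 1j * y_rot) - (x + 1j * y) * cmath.exp(-1j * phi)) < 1e-12
minus_ok = abs((x_rot - 1j * y_rot) - (x - 1j * y) * cmath.exp(1j * phi)) < 1e-12

# Assumed half-angle transformation of the spinor components
xi1, xi2 = 0.4 + 0.9j, -1.1 + 0.2j
xi1_rot = xi1 * cmath.exp(1j * phi / 2)
xi2_rot = xi2 * cmath.exp(-1j * phi / 2)

# xi2* xi1 acquires exp(i phi), so its product with x + i y is unchanged
before = (xi2.conjugate() * xi1) * (x + 1j * y)
after = (xi2_rot.conjugate() * xi1_rot) * (x_rot + 1j * y_rot)
invariant_ok = abs(before - after) < 1e-12

print(plus_ok, minus_ok, invariant_ok)
```

The product $\xi_2^*\xi_1(x + iy)$ is unchanged by the rotation, which is why such products enter a rotation-invariant expression.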

Thus, if there is given a four-vector momentum $p_t$, $p_x$, $p_y$, $p_z$
($p_t = E/c$) and a spinor $(\xi_1, \xi_2)$, the following relativistically
invariant quantity is developed from them:

$$\xi_1^*(p_t + p_z)\,\xi_1 + \xi_2^*(p_t - p_z)\,\xi_2 + \xi_2^*(p_x + ip_y)\,\xi_1 + \xi_1^*(p_x - ip_y)\,\xi_2 \tag{37.7}$$

Here we have written the components of the four-vector of
momentum between the spinor components, since subsequently $p$ should
be considered as an operator.

The relativistically invariant Lagrangian, from which follows the 
electron wave equation, must be expressed in terms of the relativ¬ 
istically invariant quantity (37.7). But the Lagrangian apparently 
involves, in addition to the momentum of the electron, also its 
mass, which is itself relativistically invariant and should therefore 
be multiplied by a like quantity. 

As we saw at the beginning of this section, a scalar may be
developed only from the components of two spinors, say $\xi$ and $\eta$, and also
from the conjugates $\xi^*$, $\eta^*$. These scalars are $\xi_1\eta_2 - \xi_2\eta_1$ and
$\xi_1^*\eta_2^* - \xi_2^*\eta_1^*$. Then the Lagrangian also involves an expression of
the form (37.7) constructed from the components of $\eta$.

The labelling of the components with asterisks is used to stress
their complex conjugate transformation law with respect to (37.1).
To go over to conventional quantum mechanical notation it is more
convenient to put

$$\xi_1 = \psi_1, \qquad \xi_2 = \psi_2, \qquad \eta_2^* = \psi_3, \qquad \eta_1^* = \psi_4$$

It should be remembered here that $\psi_3$ and $\psi_4$ transform not like $\psi_1$
and $\psi_2$, but according to complex conjugate equations.

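Let us now write the Lagrangian, denoting the wave function in
terms of $\psi_1, \ldots, \psi_4$:

$$\begin{aligned}
L ={}& \psi_1^*(p_t + p_z)\,\psi_1 + \psi_2^*(p_t - p_z)\,\psi_2 + \psi_2^*(p_x + ip_y)\,\psi_1 \\
&+ \psi_1^*(p_x - ip_y)\,\psi_2 + \psi_4^*(p_t + p_z)\,\psi_4 + \psi_3^*(p_t - p_z)\,\psi_3 \\
&+ \psi_4^*(p_x + ip_y)\,\psi_3 + \psi_3^*(p_x - ip_y)\,\psi_4 \\
&- mc\,(\psi_1^*\psi_3 - \psi_2^*\psi_4) - mc\,(\psi_3^*\psi_1 - \psi_4^*\psi_2)
\end{aligned} \tag{37.8}$$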

Note that the coefficients m can always be made the same by a due 
choice of factors multiplying the components. Therefore in this 
sense (37.8) does not involve any restriction of generality. 

The Dirac Equation. We now vary $L$ with respect to $\psi_1^*$, $\psi_2^*$, $\psi_3^*$,
and $\psi_4^*$ and equate the variations to zero. We write the operators
operating on these functions on the left, making use of their
Hermiticity:

$$\begin{aligned}
(p_t + p_z)\,\psi_1 + (p_x - ip_y)\,\psi_2 - mc\,\psi_3 &= 0 \\
(p_t - p_z)\,\psi_2 + (p_x + ip_y)\,\psi_1 + mc\,\psi_4 &= 0 \\
(p_t - p_z)\,\psi_3 + (p_x - ip_y)\,\psi_4 - mc\,\psi_1 &= 0 \\
(p_t + p_z)\,\psi_4 + (p_x + ip_y)\,\psi_3 + mc\,\psi_2 &= 0
\end{aligned} \tag{37.9}$$

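This is the required Dirac equation. The expanded form of
Eq. (37.9) is rarely used, symbolic matrix notation usually being
employed. For that the Pauli matrices (30.31) are used. Let us
suppose that these matrices operate similarly on each pair $\psi_1, \psi_2$ and
$\psi_3, \psi_4$, so that from (30.32) we obtain

$$\sigma_x\begin{pmatrix}\psi_1\\ \psi_2\\ \psi_3\\ \psi_4\end{pmatrix} = \begin{pmatrix}\psi_2\\ \psi_1\\ \psi_4\\ \psi_3\end{pmatrix}, \qquad \sigma_y\begin{pmatrix}\psi_1\\ \psi_2\\ \psi_3\\ \psi_4\end{pmatrix} = \begin{pmatrix}-i\psi_2\\ i\psi_1\\ -i\psi_4\\ i\psi_3\end{pmatrix}, \qquad \sigma_z\begin{pmatrix}\psi_1\\ \psi_2\\ \psi_3\\ \psi_4\end{pmatrix} = \begin{pmatrix}\psi_1\\ -\psi_2\\ \psi_3\\ -\psi_4\end{pmatrix}$$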

As can be seen from (37.9), we also need matrices that interchange
the pairs $\psi_1, \psi_2$ and $\psi_3, \psi_4$. We denote these matrices $\rho_1$, $\rho_2$, and $\rho_3$.
They operate in exactly the same way as $\sigma_x$, $\sigma_y$, and $\sigma_z$, but with
respect to each pair, as though we were using the notation
$\psi_1 = \psi_1^1$, $\psi_2 = \psi_2^1$, $\psi_3 = \psi_1^2$, $\psi_4 = \psi_2^2$. In other words, they would operate
only on the superscript. The commutativity of $\rho$ and $\sigma$ is readily
apparent from this, but we shall not use the two-index notation.
It has been cited only to show the commutativity of $\rho$ and $\sigma$ more
vividly.

Now consider the system (37.9). The components of attached to 
p t are in normal sequence, hence we have a unit matrix (which we 
do not write). The components of attached to p z also appear in 
normal sequence, but with different signs. Here we must write 
p 3 o z , then , i|} 2 and \|) 3 acquire minus signs. Attached to p x is the 
matrix a x , which does not interchange the first and second component 
pairs, and a y is attached to p y for the same reason. Finally, attached 
to m is the matrix —Piff z . We also introduce the following abbre¬ 
viated notation: 

a x =—o x , a y =—a y , a z =—p 3 cr z , P = p 4 a z (37.10a) 
Then in the abbreviated form the system (37.9) is written as follows: 

M = (<z x p x + o y Py + a t p z + pmc) = api|? + (37.11) 
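The operators $\sigma_x$, $\sigma_y$, and $\sigma_z$, as well as the
operators $\rho_1$, $\rho_2$, and $\rho_3$, possess the following
properties (see Sec. 30):

$$
\begin{aligned}
&\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = 1 \\
&\sigma_x\sigma_y + \sigma_y\sigma_x = \sigma_x\sigma_z + \sigma_z\sigma_x
 = \sigma_y\sigma_z + \sigma_z\sigma_y = 0 \\
&\rho_1^2 = \rho_2^2 = \rho_3^2 = 1 \\
&\rho_1\rho_2 + \rho_2\rho_1 = \rho_1\rho_3 + \rho_3\rho_1
 = \rho_2\rho_3 + \rho_3\rho_2 = 0
\end{aligned}
$$

Using this, we easily find the analogous properties of
$\boldsymbol{\alpha}$ and $\beta$:

$$
\alpha_x^2 = \alpha_y^2 = \alpha_z^2 = \beta^2 = 1 \tag{37.12}
$$

$$
\begin{aligned}
&\alpha_x\alpha_y + \alpha_y\alpha_x = \alpha_x\alpha_z + \alpha_z\alpha_x
 = \alpha_y\alpha_z + \alpha_z\alpha_y \\
&\quad = \alpha_x\beta + \beta\alpha_x = \alpha_y\beta + \beta\alpha_y
 = \alpha_z\beta + \beta\alpha_z = 0
\end{aligned}
\tag{37.13}
$$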

We apply the operator $p_t = -(h/ic)(\partial/\partial t)$ to the
left-hand side of (37.11), and the operator
$\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc$ to the right-hand side. 16
Then, using (37.12) and (37.13), we obtain

$$
p_t^2\,\psi = \mathbf{p}^2\psi + m^2c^2\psi \tag{37.14}
$$

since all the products of the different operators $\alpha_i$, $\beta$
cancel out, and the squares of operators are equal to unity. Thus, all
the operators interchanging the functions have disappeared, and we are
left with a wave-type differential equation

$$
-\frac{h^2}{c^2}\frac{\partial^2\psi}{\partial t^2}
= -h^2\nabla^2\psi + m^2c^2\psi \tag{37.15}
$$


Its relativistic invariance is obvious. Applied to a free particle,
its solution must be sought in the form of a plane wave

$$
\psi = \psi(0)\,e^{-iEt/h + i\mathbf{p}\cdot\mathbf{r}/h} \tag{37.16}
$$

whence follows the correct relativistic relationship between energy
and momentum:

$$
E^2 = c^2p^2 + m^2c^4 \tag{37.17}
$$
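
The algebra behind (37.12)-(37.14) can be checked numerically. The following is a minimal sketch, assuming the standard form of the Pauli matrices for (30.31) and representing the $\rho$ and $\sigma$ operators by Kronecker products; the helper names (`sigma`, `rho`, `satisfies_dirac_algebra`, `squared_operator`) are illustrative and do not come from the text.

```python
import numpy as np

# Pauli matrices in their standard form (assumed to agree with (30.31)).
I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def sigma(s):
    """sigma acts inside each pair (psi1, psi2) and (psi3, psi4)."""
    return np.kron(I2, s)


def rho(s):
    """rho acts between the two pairs, leaving the inner index alone."""
    return np.kron(s, I2)


# Definition (37.10a).
alpha = [-sigma(SX), -sigma(SY), -rho(SZ) @ sigma(SZ)]
beta = rho(SX) @ sigma(SZ)


def satisfies_dirac_algebra(alpha, beta):
    """True if the squares are unity (37.12) and all pairs anticommute (37.13)."""
    mats = list(alpha) + [beta]
    one = np.eye(4)
    for i, a in enumerate(mats):
        for j, b in enumerate(mats):
            expected = 2 * one if i == j else 0 * one
            if not np.allclose(a @ b + b @ a, expected):
                return False
    return True


def squared_operator(p, mc):
    """(alpha.p + beta*mc)^2 for a numeric momentum vector p."""
    op = sum(a * pk for a, pk in zip(alpha, p)) + beta * mc
    return op @ op


p = np.array([0.3, -1.2, 0.7])   # arbitrary test momentum
mc = 2.0                         # arbitrary test value of m*c
print(satisfies_dirac_algebra(alpha, beta))
print(np.allclose(squared_operator(p, mc), (p @ p + mc**2) * np.eye(4)))
```

The second check is the matrix content of (37.14): the cross terms cancel, leaving $p^2 + m^2c^2$ times the unit matrix.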

Spin has also been eliminated from the wave equation (37.15), 
which refers to a one-component wave function. For this reason 
Eq. (37.15), which was independently enunciated by Schrödinger,
Fock, Klein and Gordon as an obvious (at first glance) generalization
of the nonrelativistic wave equation, does not apply to the
electron without taking Dirac’s general equation (37.11) into
account.

16 We do not write the “cap” (^) over $p_t$ so as to stress that this
quantity is not a usual quantum mechanical operator. The operator
notation is appropriate if it is understood as the Hamiltonian divided
by $c$.
We shall now show that the Dirac equation is invariant not only
with respect to rotations and Lorentz transformations, as having
been developed from the relativistically invariant Lagrangian, but
also with respect to an inversion of the coordinate system. In an
inversion, $\mathbf{p}$ transforms to $-\mathbf{p}$, so that the Dirac
equation acquires the form

$$
p_t\psi = -(\boldsymbol{\alpha}\cdot\mathbf{p})\,\psi + \beta mc\,\psi
$$

We multiply both sides by $\beta$ and take advantage of the fact that,
from (37.13), $\beta\boldsymbol{\alpha} = -\boldsymbol{\alpha}\beta$. Then

$$
p_t(\beta\psi) = (\boldsymbol{\alpha}\cdot\mathbf{p})(\beta\psi)
+ \beta mc\,(\beta\psi) \tag{37.18}
$$

Thus, the function $\beta\psi$ satisfies the same equation as the
initial function, $\psi$. But since $\beta\psi$ is in principle
equivalent to $\psi$, we may assert that the Dirac equation has been
reduced to its initial form. Multiplication by $\beta$ is a certain
special unitary transformation of the wave function.

The known unitary transformation is usually performed in a way
such that matrices $\boldsymbol{\alpha}$ and $\beta$, while retaining
their basic properties (37.12) and (37.13), are differently expressed
in terms of matrices $\rho$ and $\sigma$. We shall not perform the
transformation itself and simply write the new expressions similar to
(37.10a):

$$
\alpha_x = \rho_1\sigma_x,\quad \alpha_y = \rho_1\sigma_y,\quad
\alpha_z = \rho_1\sigma_z,\quad \beta = \rho_3 \tag{37.10b}
$$

The Dirac equation employing such $\boldsymbol{\alpha}$ and $\beta$ is
more convenient for the limiting transition to the nonrelativistic wave
equation. It is easy to verify that $\boldsymbol{\alpha}$ and $\beta$
determined with the help of (37.10b) really do possess the properties
(37.12) and (37.13), which alone is necessary to obtain the correct
energy-momentum relation (37.17).
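
The verification mentioned above can be written out in a few lines. This is only a sketch of one way to do it, relying on the commutativity of $\rho$ and $\sigma$ and on the properties listed before (37.12); the index notation $i, j = x, y, z$ and the Kronecker symbol $\delta_{ij}$ are introduced here for brevity.

```latex
\begin{aligned}
\alpha_i\alpha_j + \alpha_j\alpha_i
  &= \rho_1^2\,(\sigma_i\sigma_j + \sigma_j\sigma_i) = 2\delta_{ij}, \\
\alpha_i\beta + \beta\alpha_i
  &= \sigma_i\,(\rho_1\rho_3 + \rho_3\rho_1) = 0, \\
\beta^2 &= \rho_3^2 = 1 .
\end{aligned}
```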

Energy Eigenvalues. It follows from Eq. (37.17) that

$$
E = \pm\,(c^2p^2 + m^2c^4)^{1/2} \tag{37.19}
$$

The energy eigenvalue of an electron determined from the Dirac
equation may be negative as well as positive. In classical mechanics
only the plus sign is taken, because free electrons do not have
negative energies.

The absolute value of the square root in (37.19) is not less than
$mc^2$, so that there is an energy gap of width $2mc^2$ to which the
energy of the electron cannot belong. In classical equations all
quantities vary continuously, therefore energy, once defined by a
positive sign, does not cross the prohibited $2mc^2$ gap and always
retains the appropriate sign. In other words, energy positively defined
by initial conditions remains positive according to the equations of
motion.
But in quantum theory discontinuous transitions between different
states are also possible. For example, an electron with energy
greater than $mc^2$ could emit a light quantum and remain with an
energy smaller than $-mc^2$. However, electrons with negative energy
and mass are not observed in nature. Their properties would be very
strange indeed: while radiating they would dissipate their energy,
collapsing, so to say, into a state with energy $E = -\infty$. Very
soon all the electrons in the universe would have collapsed into that
state, contrary to what we observe all around us.

Thus, the Dirac equation allows for the possibility of states which, 
on the one hand, cannot be rejected, because electrons could transfer 
to them from known states, while, on the other hand, electrons with 
negative energy nonetheless do not exist. At the same time the 
Dirac equation describes a variety of electron properties absolutely 
correctly: as we shall soon see, it yields a relationship between spin 
and magnetic moment that agrees with experiment, leads to the 
precise formula of the fine structure of the hydrogen atom, etc. 
Furthermore, mathematical investigations show that there is no
substantially different relativistically invariant equation for a
particle of half-integral spin and nonzero mass. Our derivation of the Dirac
equation from the invariant expression for the Lagrangian developed 
from spinors points to this convincingly enough. It would therefore 
be wrong to reject the Dirac equation out of hand: it is much better 
to supplement it with some suitable physical hypothesis. 

Dirac suggested reformulating the concept of vacuum. Formerly
by vacuum was meant a state of the electromagnetic field without
electric charges. Dirac felt it possible to describe as vacuum a state
in which all levels with negative energy are occupied by electrons.
That this redefinition is not just an exercise in semantics but
carries physical meaning will be made apparent shortly.

If all levels with negative energy are occupied, then by Pauli’s 
exclusion principle electrons cannot transfer to them from positive- 
energy states. Thus, Pauli’s exclusion principle is essential for the 
relativistic theory to be able to describe electron properties at all. 
Therein is the substantiation of Pauli’s principle as a necessary 
element of quantum mechanics. In nonrelativistic theory Pauli’s 
principle exists simply as a supplementary postulate of the many- 
body problem. 

In Section 36 we defined vacuum as a state of an electromagnetic 
field in which there are no quanta, in other words, the ground state 
with the least possible energy. In the same way, if all negative- 
energy levels are occupied, the remaining electrons can no longer 
reduce their energy by passing into negative-energy states, and 
consequently, a state with only occupied negative levels possesses the 
least possible energy with respect to the electrons. Such a state is 
termed the electron field vacuum. 
Pair Production. The expression “electron field” is used by analogy
with electromagnetic field for the following reason. The Dirac
equation is essentially never applied to one electron: it always
implies the existence of a “background”, that is, negative-energy
states occupied by other electrons. Otherwise that electron would
itself pass into a negative-energy state.

But the “background” not only keeps the electron from
“falling”; its existence is manifested in a real physical process.
In an external electromagnetic field, for example, close to a nucleus,
a quantum with energy greater than $2mc^2$ is capable of catapulting an
electron from a negative-energy state to a positive-energy state.
The external field is necessary to satisfy the momentum conservation
law. For the proof of this simple assertion see Exercise 1.

But the removal of an electron from the negative-energy state 
leaves a “hole”, that is, an unoccupied level (cf. Sec. 33). In an 
electric field, electrons of negative mass (mass is of the same sign as 
energy) move not against the field, towards the anode, but along the 
field, towards the cathode. The “hole” moves together with them, 
thus behaving like a positively charged electron with positive mass. 

Experimentally, the ejection of an electron from the negative-energy
state should produce two charges, one negative and one positive.
This theoretical prediction was subsequently confirmed by
Anderson.

A positron and an electron may mutually annihilate on impact
if the electron transfers to an unoccupied level in the negative-energy
states. Its energy is transferred to the electromagnetic radiation
in the form of two or three quanta. Annihilation in vacuum
cannot produce one quantum, as this would violate the momentum
conservation law, just as one quantum cannot produce an
electron-positron pair in vacuum. But in a nuclear field one-quantum
annihilation is possible.

Electrons and positrons are known as particles and antiparticles 
because of their ability for mutual annihilation. Antiprotons and 
antineutrons have also been discovered. 

Owing to the “background”, the relativistic quantum theory of the 
electron is in effect a theory not of a separate particle but of a field 
in which the number of particles is not defined. Depending on the 
energy, one or several pairs may appear in the field in addition to 
the one electron, much as quanta are emitted in an electromagnetic 
field. Only the total charge is strictly conserved, but not the number 
of particles. 

If the energy is insufficient for the actual production of pairs, 
transitions may exist in which pairs appear and then vanish. The 
intermediate states are of such short duration that their energy is 
quite indeterminate, as in the case of an alpha-particle below the 
potential barrier on ejection from a nucleus. Such intermediate states 
can be detected in observable physical effects. For example, in the
Coulomb field of the nucleus a polarization of the vacuum, as it were,
occurs, that is, a resultant displacement of the “background” due to
pair production and annihilation. That is why the field acting on an
electron close to the nucleus (at distances of the order
$h/(mc) \sim 10^{-11}$ cm) is not strictly a Coulomb field. This affects
the configuration of the energy levels of atomic electrons (see further
on).

Thus, the Dirac equation led to much more than a simple refinement
of quantum mechanics in the relativistic domain. The concept
of field was extended to particles, which led to the prediction of
antiparticles.

Charge Conjugation. From what has been said of the positron one
could gain the impression that it was something of an entirely
different nature than the electron: the electron is a particle, the
positron is a “hole”. An apparent asymmetry developed between the
charges of the two signs. However, a theoretical formulation is possible
which completely restores the symmetry between the electron and
the positron.

As said before, Eq. (37.11) allows for solutions corresponding to
both positive and negative energies. Furthermore, to each energy
sign there correspond two spin projection signs, making for a total
of four solutions. Those corresponding to the positive sign of energy
have physical meaning. A filled background was added to eliminate
the real appearance of the second two solutions, which have no
natural confirmation.

The Dirac equation can be so transformed that the equation
describing the positron becomes entirely the same as the equation
for the electron. We are now speaking of a positron, that is, of an
independent particle of positive energy, not a “hole”. At the same
time, such theoretical predictions as electron-positron production
and annihilation and the polarization of vacuum remain valid,
because the equations are from the outset written for fields, not
separate particles.

Let us consider a transformation of the Dirac equation which leads 
to the wave function of the positron. 

If the Lagrangian is varied with respect to the $\psi$ components
rather than the $\psi^*$ components, the expression (37.8) yields
Dirac’s equation for a complex conjugate function. In this equation it
is convenient to first write the wave function on the left of the
operators, as it appears in the Lagrangian. In performing the variation
we must also make a transformation by parts, mindful that $L$ itself
appears under the sign of a four-dimensional integral in the action
expression, $S = \int L\,d^4x$. In this way the derivatives involved in
$p_t$ and $\mathbf{p}$ are transferred from the variations to the
required functions, with $p_t$ accordingly being replaced by $-p_t$,
and $\mathbf{p}$ by $-\mathbf{p}$. In the complex conjugate equation
these operators operate not from left to right but, by definition, from
right to left:

$$
\psi^*\,(-p_t + \boldsymbol{\alpha}\cdot\mathbf{p} - mc\beta) = 0 \tag{37.20a}
$$

The operators $\boldsymbol{\alpha}$ and $\beta$ also operate from right
to left, that is, the $k$th component of the wave function, $\psi_k^*$,
is multiplied by the matrix element in the $k$th row rather than the
matrix element in the $k$th column. In equation form it looks like this:

$$
\sum_{k=1}^{4}\alpha_{ki}\,\psi_k^* = \sum_{k=1}^{4}\psi_k^*\,\alpha_{ki}
= (\psi^*\alpha)_i \tag{37.21}
$$

Consequently, to return to the usual notation, the columns and
rows of the matrices $\boldsymbol{\alpha}$ and $\beta$ must be
interchanged. They are then denoted as $\tilde{\boldsymbol{\alpha}}$ and
$\tilde{\beta}$, and Eq. (37.20a) becomes

$$
(-p_t + \tilde{\boldsymbol{\alpha}}\cdot\mathbf{p} - mc\tilde{\beta})\,\psi^* = 0
\tag{37.20b}
$$

But there also exists a transformation which reverts equation
(37.20b) completely to the initial form of the Dirac equation
(37.11). Let us obtain this transformation, denoted by the letter $C$.
It should be applied as an operator to Eq. (37.20b) on the left,
requiring that as a result of the commutations with the operators
$\tilde{\boldsymbol{\alpha}}$ and $\tilde{\beta}$ we obtain an equation
for the function $C\psi^*$ identical in form with (37.11). In other
words, $C$ must satisfy the following commutation relations:

$$
C\tilde{\boldsymbol{\alpha}} = \boldsymbol{\alpha}C \tag{37.22}
$$

$$
C\tilde{\beta} = -\beta C \tag{37.23}
$$

The specific form of $C$ depends upon the choice of the matrices
$\boldsymbol{\alpha}$ and $\beta$. We assume that they satisfy the
relation (37.10b). The matrices $\rho$ and $\sigma$ are Hermitian;
hence, if they are made up of real elements, an interchange of rows and
columns alters nothing. If they consist of purely imaginary elements,
like the elements of $\rho_2$ and $\sigma_y$, an interchange reverses
the sign of the matrix. Now, substituting in place of the components
$\tilde{\alpha}_x$, $\tilde{\alpha}_y$, $\tilde{\alpha}_z$ and
$\tilde{\beta}$ their expressions from (37.10b), we rewrite (37.22) and
(37.23) in the form of equations in components:

$$
\begin{aligned}
C\rho_1\sigma_x &= \rho_1\sigma_x C, & C\rho_1\sigma_y &= -\rho_1\sigma_y C \\
C\rho_1\sigma_z &= \rho_1\sigma_z C, & C\rho_3 &= -\rho_3 C
\end{aligned}
\tag{37.24}
$$
It follows from this that

$$
C = \rho_2\sigma_y \tag{37.25}
$$

Since $\rho_2$ is anticommutative with $\rho_1$, and $\sigma_y$ is
anticommutative with $\sigma_x$, the first equation in (37.24) holds.
The other equations are verified in the same way.
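
The relations for $C$ can also be confirmed numerically. The sketch below assumes the standard Pauli matrices and the representation (37.10b), with the transposed (row-column interchanged) matrices obtained via `.T`; the function name `is_charge_conjugation` is illustrative only.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),     # sigma_x (also rho_1)
    np.array([[0, -1j], [1j, 0]], dtype=complex),  # sigma_y (also rho_2)
    np.array([[1, 0], [0, -1]], dtype=complex),    # sigma_z (also rho_3)
]
SIGMA = [np.kron(I2, s) for s in PAULI]  # act inside each pair
RHO = [np.kron(s, I2) for s in PAULI]    # act between the pairs

# Representation (37.10b): alpha = rho_1 * sigma, beta = rho_3.
alpha = [RHO[0] @ s for s in SIGMA]
beta = RHO[2]

# Candidate transformation (37.25).
C = RHO[1] @ SIGMA[1]


def is_charge_conjugation(C, alpha, beta):
    """Check C * alpha^T = alpha * C and C * beta^T = -beta * C."""
    ok_alpha = all(np.allclose(C @ a.T, a @ C) for a in alpha)
    ok_beta = np.allclose(C @ beta.T, -beta @ C)
    return ok_alpha and ok_beta


print(is_charge_conjugation(C, alpha, beta))
print(np.allclose(C @ C, np.eye(4)))  # C applied twice gives the unit matrix
```

Only plain transposition is used, not Hermitian conjugation, because the tilde in the text denotes the interchange of rows and columns.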

Thus, operating with $C$ on Eq. (37.20b) and interchanging it with
the operators $\tilde{\boldsymbol{\alpha}}$ and $\tilde{\beta}$, we
obtain

$$
(p_t - \boldsymbol{\alpha}\cdot\mathbf{p} - mc\beta)\,C\psi^* = 0 \tag{37.26}
$$

which is fully identical with (37.11). But the complex conjugate
function depends upon the coordinates and time according to
the law

$$
\psi^* = \psi^*(0)\,e^{iEt/h - i\mathbf{p}\cdot\mathbf{r}/h} \tag{37.27}
$$

To it also correspond two energy signs: $E = \pm(c^2p^2 + m^2c^4)^{1/2}$.
If we substitute energy with the negative sign into (37.27), reverse
the direction of the momentum $\mathbf{p}$, and subject $\psi^*(0)$ to
the $C$ transformation, we obtain a wave function of a particle of
positive energy satisfying the same equation as the electron. The
momentum sign is taken the reverse of that of the electron so as to have
the same spatio-temporal dependence of the wave function.

We have thus proved that the function $C\psi^*$ can be considered as
belonging to a positive electron (according to the sign of the
momentum) with positive energy. In other words,
$C\psi^*_{-E}(-\mathbf{p})$ is a function of a positron moving in a
field in the opposite direction of an electron.

The C transformation effects the transition from an electron to 
a positron. But since C 2 = 1, the same transformation transforms 
“positron” equations into “electron” equations, which establishes 
the symmetry between both particles. 

The $C$ transformation is known as charge conjugation; it “transfers”
particles into antiparticles.

W. Pauli and V. F. Weisskopf showed that particles without spin
can also have antiparticles. Subsequently such a particle was in
fact discovered: the $\pi$-meson; $\pi^+$- and $\pi^-$-mesons possess
particle-antiparticle properties.

If the $C$ transformation does not alter the form of the wave function
of a particle, then the particle and antiparticle are identical.
Such particles are at present termed absolutely neutral, as distinct,
for example, from the electrically neutral neutron, which nevertheless
has its antiparticle. Absolutely neutral are the $\pi^0$-meson and
the light quantum.
The Transformation to the Nonrelativistic Wave Equation.
It is instructive to compare the relativistic wave equation with the
Schrödinger equation with the help of a limiting process. Since we
are primarily interested in the motion of an electron in an external
field, we first write the corresponding Dirac equation for an electron
interacting with an electromagnetic field. For this we replace the
momentum $\mathbf{p}$ by $\mathbf{p} - (e/c)\mathbf{A}$, and the energy
$E$ by $E + e\varphi$ (Sec. 14). Thus, Dirac’s equation for this case
takes the form

$$
p_t\psi = \boldsymbol{\alpha}\cdot\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)\psi
+ \beta mc\,\psi + \frac{e\varphi}{c}\,\psi \tag{37.28}
$$


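We assume that the operators $\boldsymbol{\alpha}$ and $\beta$ have been
chosen according to (37.10b). We expand only $\rho$, and not $\sigma$.
In other words, we treat the first pair $(\psi_1, \psi_2)$ and the
second pair $(\psi_3, \psi_4)$ as one entity (a bispinor):

$$
\chi_1 = \begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix},\qquad
\chi_2 = \begin{pmatrix}\psi_3\\ \psi_4\end{pmatrix} \tag{37.29}
$$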


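We know how $\rho_1$ and $\rho_3$ operate on them. Writing this down in
explicit form, we obtain Eq. (37.28) for $\chi_1$ and $\chi_2$:

$$
\begin{aligned}
\frac{E}{c}\chi_1 = p_t\chi_1 &= \boldsymbol{\sigma}\cdot\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)\chi_2
 + mc\,\chi_1 + \frac{e\varphi}{c}\chi_1 \\
\frac{E}{c}\chi_2 = p_t\chi_2 &= \boldsymbol{\sigma}\cdot\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)\chi_1
 - mc\,\chi_2 + \frac{e\varphi}{c}\chi_2
\end{aligned}
\tag{37.30}
$$

From the second equation it follows that

$$
\chi_2 = \frac{1}{E/c + mc - e\varphi/c}\;
\boldsymbol{\sigma}\cdot\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)\chi_1
\tag{37.31}
$$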

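In nonrelativistic motion, when $v \ll c$, the energy $E$ of the
particle is very close to $mc^2$, so that the whole denominator in
(37.31) is in the first, initial approximation replaced by $2mc$. Now,
substituting $\chi_2$ into the first equation of (37.30), we find that
the two-component function $\chi_1$ satisfies the following equation:

$$
\left(\frac{E}{c} - mc\right)\chi_1 = (p_t - mc)\,\chi_1
= \frac{1}{2mc}\left[\boldsymbol{\sigma}\cdot\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)\right]^2\chi_1
+ \frac{e\varphi}{c}\chi_1 \tag{37.32}
$$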


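The difference $E/c - mc$ in the left-hand side of the equation is
the nonrelativistic Hamiltonian of the particle, $\hat{\mathcal{H}}$,
divided by $c$. We transform the first expression attached to $\chi_1$
in the right-hand side of (37.32), which, after multiplying by $2mc$,
we write as follows:

$$
\left[\boldsymbol{\sigma}\cdot\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)\right]^2
= (\boldsymbol{\sigma}\cdot\mathbf{p})^2 + \frac{e^2}{c^2}(\boldsymbol{\sigma}\cdot\mathbf{A})^2
- \frac{e}{c}\left[(\boldsymbol{\sigma}\cdot\mathbf{p})(\boldsymbol{\sigma}\cdot\mathbf{A})
+ (\boldsymbol{\sigma}\cdot\mathbf{A})(\boldsymbol{\sigma}\cdot\mathbf{p})\right]
\tag{37.33}
$$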
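Here the first term is

$$
\begin{aligned}
(\boldsymbol{\sigma}\cdot\mathbf{p})^2
&= \sigma_x^2p_x^2 + \sigma_y^2p_y^2 + \sigma_z^2p_z^2
 + (\sigma_x\sigma_y + \sigma_y\sigma_x)\,p_xp_y \\
&\quad + (\sigma_x\sigma_z + \sigma_z\sigma_x)\,p_xp_z
 + (\sigma_y\sigma_z + \sigma_z\sigma_y)\,p_yp_z \\
&= p_x^2 + p_y^2 + p_z^2 = p^2
\end{aligned}
\tag{37.34}
$$

and similarly

$$
(\boldsymbol{\sigma}\cdot\mathbf{A})^2 = A^2 \tag{37.35}
$$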


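Making use of the anticommutativity of the components of
$\boldsymbol{\sigma}$, we reduce the last term in (37.33) to the form

$$
\begin{aligned}
&(\boldsymbol{\sigma}\cdot\mathbf{p})(\boldsymbol{\sigma}\cdot\mathbf{A})
 + (\boldsymbol{\sigma}\cdot\mathbf{A})(\boldsymbol{\sigma}\cdot\mathbf{p}) \\
&\quad = (\mathbf{A}\cdot\mathbf{p}) + (\mathbf{p}\cdot\mathbf{A})
 + [(p_xA_y - A_yp_x) - (p_yA_x - A_xp_y)]\,i\sigma_z \\
&\qquad + [(p_zA_x - A_xp_z) - (p_xA_z - A_zp_x)]\,i\sigma_y \\
&\qquad + [(p_yA_z - A_zp_y) - (p_zA_y - A_yp_z)]\,i\sigma_x
\end{aligned}
\tag{37.36}
$$

where the properties of the Pauli matrices, (30.34a)-(30.34c), have
been used.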
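
The Pauli-matrix product rule underlying (37.36) can be checked for ordinary (commuting) vectors. The following is a small sketch assuming the standard Pauli matrices; `sigma_dot` and `check_product_rule` are hypothetical helper names. For commuting components the cross-product terms of the two orderings cancel, which is why only the commutators of $\mathbf{p}$ and $\mathbf{A}$ survive in the spin-dependent part of (37.36).

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a numeric 3-vector v."""
    return v[0] * SX + v[1] * SY + v[2] * SZ


def check_product_rule(a, b):
    """Verify (sigma.a)(sigma.b) = a.b + i sigma.(a x b) for numeric a, b."""
    lhs = sigma_dot(a) @ sigma_dot(b)
    rhs = np.dot(a, b) * np.eye(2) + 1j * sigma_dot(np.cross(a, b))
    return np.allclose(lhs, rhs)


a = np.array([1.0, 2.0, 3.0])
b = np.array([-0.5, 0.4, 2.0])
print(check_product_rule(a, b))

# For commuting vectors the symmetrized product has no spin-dependent part.
sym = sigma_dot(a) @ sigma_dot(b) + sigma_dot(b) @ sigma_dot(a)
print(np.allclose(sym, 2 * np.dot(a, b) * np.eye(2)))
```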

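The commutator $p_xA_y - A_yp_x$ is equal to

$$
\frac{h}{i}\left(\frac{\partial}{\partial x}A_y - A_y\frac{\partial}{\partial x}\right)
= \frac{h}{i}\frac{\partial A_y}{\partial x} \tag{37.37}
$$

so that the factor of $\sigma_z$ is proportional to the $z$ component
of the magnetic field, $H_z$:

$$
i\,\frac{h}{i}\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right) = hH_z
$$

The same holds for the other two components of $\mathbf{H}$.
Collecting all the terms in (37.32), we obtain

$$
\hat{\mathcal{H}}\chi_1 = \frac{1}{2m}\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)^2\chi_1
+ e\varphi\,\chi_1 - \frac{eh}{2mc}\,(\boldsymbol{\sigma}\cdot\mathbf{H})\,\chi_1
\tag{37.38}
$$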


The first two terms in the right-hand side of (37.38) represent
the nonrelativistic Hamiltonian of a spinless particle in an external
electromagnetic field. The last term is the energy of the magnetic
moment

$$
\boldsymbol{\mu} = \frac{eh}{2mc}\,\boldsymbol{\sigma} \tag{37.39}
$$

in the external magnetic field, $\mathbf{H}$. Thus, from the Dirac
equation we have obtained the correct relationship between the magnetic
moment and the angular momentum of an electron, (30.51).

Also of interest is the development of a formula for the interaction
energy between the magnetic spin and orbital angular momenta of
the electron. This quantity involves the square of the speed of light
in the denominator, and is therefore obtained in an approximation
higher than (37.38). We, however, shall not seek all the terms of
this approximation and take only what is of interest to us. We
can see from Eq. (37.31) that the higher-order term in $c^{-2}$ is

$$
\chi_2' = \frac{e\varphi}{4m^2c^3}\;
\boldsymbol{\sigma}\cdot\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)\chi_1
\tag{37.40}
$$

Substituting the correction (37.40) into (37.30), we see that an
operator involving both the electric field and the spin can only be
obtained from the expression

$$
(\boldsymbol{\sigma}\cdot\mathbf{p})\,e\varphi\,(\boldsymbol{\sigma}\cdot\mathbf{p}) \tag{37.41}
$$

Let us interchange the potential, $\varphi$, with the second factor
$(\boldsymbol{\sigma}\cdot\mathbf{p})$. Using a formula similar to
(37.37), we can write

$$
e\varphi\,(\boldsymbol{\sigma}\cdot\mathbf{p})
= (\boldsymbol{\sigma}\cdot\mathbf{p})\,e\varphi - \frac{eh}{i}\,(\boldsymbol{\sigma}\cdot\operatorname{grad}\varphi)
= (\boldsymbol{\sigma}\cdot\mathbf{p})\,e\varphi - ieh\,(\boldsymbol{\sigma}\cdot\mathbf{E})
$$

The product $(\boldsymbol{\sigma}\cdot\mathbf{p})(\boldsymbol{\sigma}\cdot\mathbf{E})$
expands according to the general rule (30.36):

$$
(\boldsymbol{\sigma}\cdot\mathbf{p})(\boldsymbol{\sigma}\cdot\mathbf{E})
= (\mathbf{p}\cdot\mathbf{E}) + i\boldsymbol{\sigma}\cdot(\mathbf{p}\times\mathbf{E})
$$

Here, the first term does not depend upon the spin and is immaterial
for the spin-orbit interaction. In the second term we make use of the
fact that the self-consistent field acting on the electron is a central
field, so that $\mathbf{E} = -(\mathbf{r}/r)(d\varphi/dr)$, and
$\mathbf{r}\times\mathbf{p}$ is the orbital angular momentum operator of
the electron, $\mathbf{l}$. Expressing it in units of $h$, we obtain the
formula for the energy operator of the spin-orbit interaction:

$$
\hat{\mathcal{H}}_{sl} = \frac{eh^2}{4m^2c^2}\,\frac{1}{r}\frac{d\varphi}{dr}\,
(\boldsymbol{\sigma}\cdot\mathbf{l}) \tag{37.42}
$$

Equation (37.38) is applicable only to the electron. Although the
proton also has half-integral spin, its magnetic moment is about 2.8
times greater than obtained in the last term of (37.38). The neutron
also possesses magnetic moment, equal in magnitude, in the same units,
to 1.9. At the same time, according to Dirac’s theory a neutral particle
should not have any magnetic moment at all, since $e$ in (37.38) denotes
charge.

Usually the following explanation is offered for the reason why
the proton and the neutron are not subject to the Dirac equation.
Both nuclear particles interact very strongly via a meson field.
For that reason they continuously emit and absorb mesons, in the
same way as an electromagnetic field produces electron-positron
pairs which immediately annihilate. But whereas such pairs do not
make a great contribution to the total electromagnetic effect (since
the measure of charge-field interaction is a small quantity
$e^2/(hc) = 1/137$), nuclear interactions (which are very strong) make
it impossible to consider neutrons and protons as isolated from the
meson field surrounding them.

This explanation has not been verified by quantitative computations,
because the theory of nuclear forces has not yet been developed.
However, a direct study of the electromagnetic field of the
proton and the neutron with very fast electrons indicates that both
are indeed surrounded by charges and currents in a region of about
$10^{-14}$ cm.

The Fine Structure Formula. As an exception, we shall present
without proof an important formula for the energy eigenvalues of
an electron in the hydrogen atom or in any Coulomb field of charge $Ze$:

$$
E = mc^2\left\{1 + \frac{\alpha^2Z^2}
{\left\{n - (j + 1/2) + \left[(j + 1/2)^2 - \alpha^2Z^2\right]^{1/2}\right\}^2}\right\}^{-1/2}
\tag{37.43}
$$

Here, $n$ is the principal quantum number, $j = l \pm 1/2$ is the total
angular momentum of the electron, and $\alpha = e^2/(hc)$. If $Z\alpha$
is assumed small in comparison with unity, we obtain the nonrelativistic
formula (29.41).
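
To see how (37.43) behaves numerically, the sketch below evaluates it in units of $mc^2$ and compares the binding energy with the leading nonrelativistic term $-\alpha^2Z^2/(2n^2)$. The numerical value used for $\alpha$ and the function names are assumptions of this illustration, not part of the text.

```python
import math

ALPHA = 1 / 137.036  # approximate value of e^2/(hc), assumed for illustration


def dirac_energy(n, j, Z=1, alpha=ALPHA):
    """Energy from (37.43) in units of m c^2 (rest energy included)."""
    k = j + 0.5
    denom = n - k + math.sqrt(k * k - (alpha * Z) ** 2)
    return (1 + (alpha * Z / denom) ** 2) ** -0.5


def nonrelativistic_binding(n, Z=1, alpha=ALPHA):
    """Leading-order binding energy -(alpha Z)^2 / (2 n^2) in units of m c^2."""
    return -((alpha * Z) ** 2) / (2 * n * n)


for n, j in [(1, 0.5), (2, 0.5), (2, 1.5)]:
    binding = dirac_energy(n, j) - 1.0
    print(n, j, binding, nonrelativistic_binding(n))
```

Because the formula depends only on $n$ and $j$, states with the same $n$ and $j$ but different $l$ come out with exactly the same energy in this approximation, while levels with the same $n$ and different $j$ are split in order $\alpha^4$.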

Equation (37.43) agrees with the result of Exercise 9, Section 14, 
in a remarkable way. If in the classical formula we replace the action 
variables in terms of Bohr’s quantum conditions (31.42), we obtain 
Eq. (37.43), which was developed in this way by Sommerfeld without 
taking electron spin into account. But in that case, of course, the 
number of atomic states comes out wrong. 

Radiation Corrections. It follows from Eq. (37.43) that the energies
of states $2s_{1/2}$ and $2p_{1/2}$ must be the same, since they
correspond to the same $n$ and $j$. Actually, though, they differ in frequency
by about 1058 megacycles per second.

The energy difference is due to the fact that in developing the 
fine structure formula (37.43) the effect of zero-point oscillations of 
the electromagnetic-field vacuum was not taken into account. These 
oscillations differently affect the electron in the s and p states and 
therefore split the degenerate energy level. 

Besides zero-point oscillations, a certain contribution to the 
splitting is made by the already mentioned polarization of vacuum 
due to the production and annihilation of electron-positron pairs. 
Since the polarization of vacuum occurs mainly at small distances 
from the nucleus (of the order of h/(mc)), while the electron in a 
hydrogen atom is much farther away, at a distance of the order of 
one atomic unit, that is $h^2/(me^2)$, the polarization contribution to
the splitting of the energy level is relatively small: around three
per cent of the total effect. Nevertheless, the precision of
experimental data is so great that the reality of the effect of vacuum
polarization has been fully confirmed.

The corrections to the formulas due to zero-point oscillations or 
pair production and annihilation are known as radiation corrections. 

In calculating them diverging integrals always appear. Therefore 
the following procedure is adopted. The energy shift of a free electron 
due, say, to zero-point oscillations is calculated, together with the 
same shift of an electron bound in an atom. Both corrections lead to 
diverging integrals, but their difference can be determined, because 
it is finite; moreover, this process is quite unambiguous. 

The meaning of this subtraction consists in the following. Physically
the electron is inseparable from its charge, that is from the
radiation field. When we speak of a “free” electron, we actually have
in mind that the electron interacts with the radiation field which,
as was shown, cannot be assumed zero even in the ground state.
Thus, by subtracting the energy corrections for a free electron from
the energy corrections for a bound electron, we thereby simply
redefine the concept of a free electron.

Ultimately a small and finite shift is obtained, the relative
smallness of which is due to the fact that the fine structure constant,
$e^2/(hc)$, is small in comparison with unity.

Similarly, we are able to find the correction to the
magnetomechanical ratio of the electron, that is $eh/(2mc)$. It agrees
with experiments in the next two approximations with respect to
$e^2/(hc)$.

Thus, quantum electrodynamics meets the basic requirements
of any physical theory: it can precalculate any observable effect,
to any degree of accuracy, uniquely, and intrinsically unambiguously,
in full agreement with experiments.

The most important unsolved problem of quantum electrodynamics
consists in the theoretical development of the dimensionless
quantity 1/137. So far, however, we do not even know if this
is at all possible in the framework of electrodynamics alone, without
introducing other fields besides the electromagnetic.


EXERCISES 

1. Prove that a quantum cannot give rise to an electron-positron pair
in free space in the absence of an additional external field.

Solution. The conservation laws in the absence of a field are written
thus:

$$
\mathbf{p} + \frac{h\omega}{c}\,\mathbf{n} = \mathbf{p}_1,
\qquad
-(m^2c^4 + c^2p^2)^{1/2} + h\omega = (m^2c^4 + c^2p_1^2)^{1/2}
$$

Here, $\mathbf{p}$ is the electron momentum in a negative energy state,
$\mathbf{n}$ is the unit vector in the direction of the quantum
momentum, $\mathbf{p}_1$ is the electron momentum in a positive energy
state. Substituting $\mathbf{p}_1$ from the first equality into the
second and squaring the left- and right-hand sides, it is easy to see
that this equation is not satisfied.
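
One way to make the contradiction explicit without squaring is sketched below. The shorthand $\varepsilon$ and $\varepsilon_1$ for the two square roots is introduced here for convenience and is not the book's notation; the argument assumes $m \neq 0$.

```latex
\begin{aligned}
&\varepsilon = (m^2c^4 + c^2p^2)^{1/2} > cp, \qquad
 \varepsilon_1 = (m^2c^4 + c^2p_1^2)^{1/2} > cp_1, \\
&\text{energy:}\quad h\omega = \varepsilon + \varepsilon_1, \\
&\text{momentum:}\quad \frac{h\omega}{c} = |\mathbf{p}_1 - \mathbf{p}|
 \le p_1 + p < \frac{\varepsilon_1 + \varepsilon}{c} = \frac{h\omega}{c},
\end{aligned}
```

so the two conservation laws cannot hold simultaneously.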

Another method of proof is based on simple reasoning. A transition to
another inertial reference frame can always make the energy of a quantum
less than $2mc^2$. A quantum cannot give rise to a pair in such a
reference frame, simply because it has insufficient energy. But what is
impossible in one reference frame is impossible in all reference frames,
because the possibility or impossibility of an event does not depend
upon the choice of reference frame.

The preceding argument no longer holds if pair production is considered
close to a nucleus. Here, the nucleus is at rest in one reference frame
and in motion in another. Where the energy of the quantum is less than
$2mc^2$, the moving nucleus will “help” it produce a pair. Naturally,
pair production is impossible if the energy of the quantum in the rest
frame of the nucleus is less than $2mc^2$.


2. Obtain the solution to the Dirac equation for a free electron.

Solution. Equate $\psi_1$ to zero. Then the first equation of (37.11)
is satisfied if we take $\psi_3 = Ac\,(p_x - ip_y)$ and
$\psi_4 = -Acp_z$. Here $\boldsymbol{\alpha}$ and $\beta$ are
determined by (37.10b), and not by (37.10a). The second equation of
(37.11) gives

$$
\psi_2 = \frac{Ac^2\,(p_x^2 + p_y^2 + p_z^2)}{E - mc^2}
= \frac{A\,(E^2 - m^2c^4)}{E - mc^2} = A\,(E + mc^2)
$$

The third equation of (37.11) reduces to the identity

$$
(E + mc^2)\,\psi_3 = Ac\,(E + mc^2)(p_x - ip_y)
= c\,(p_x - ip_y)\,\psi_2 = Ac\,(p_x - ip_y)(E + mc^2)
$$

The fourth equation also reduces to an identity. The number $A$ is
determined from the normalization condition
$|\psi_1|^2 + |\psi_2|^2 + |\psi_3|^2 + |\psi_4|^2 = 1$,
or $A = [2E(E + mc^2)]^{-1/2}$.

The components $\psi_3$ and $\psi_4$ are small compared with $\psi_2$
if $v \ll c$. Therefore the solution corresponds to positive energy.
Another solution with positive energy is obtained if we take
$\psi_2 = 0$. Negative energy solutions are obtained if we choose
$\psi_3 = 0$ or $\psi_4 = 0$.
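
The free-particle solution of Exercise 2 can be verified numerically. The sketch below assumes the representation (37.10b) with standard Pauli matrices and multiplies (37.11) by $c$, so that the matrix `hamiltonian` is $c\,\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc^2$; the function names and the test values of $\mathbf{p}$, $m$, $c$ are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

# Representation (37.10b): alpha_i = rho_1 sigma_i, beta = rho_3.
ALPHA = [np.kron(SX, s) for s in (SX, SY, SZ)]
BETA = np.kron(SZ, I2)


def hamiltonian(p, m, c):
    """c * alpha.p + beta * m c^2 for a numeric momentum p."""
    return c * sum(a * pk for a, pk in zip(ALPHA, p)) + BETA * m * c**2


def free_solution(p, m, c):
    """Positive-energy spinor with psi_1 = 0, as constructed in Exercise 2."""
    px, py, pz = p
    E = np.sqrt(c**2 * (px**2 + py**2 + pz**2) + m**2 * c**4)
    A = 1.0 / np.sqrt(2 * E * (E + m * c**2))
    psi = np.array(
        [0.0, A * (E + m * c**2), A * c * (px - 1j * py), -A * c * pz],
        dtype=complex,
    )
    return E, psi


p, m, c = (0.3, -0.4, 1.2), 1.0, 1.0   # arbitrary test values
E, psi = free_solution(p, m, c)
print(np.allclose(hamiltonian(p, m, c) @ psi, E * psi))  # eigenvector check
print(np.isclose(np.vdot(psi, psi).real, 1.0))           # normalization check
```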

3. Show that from the Dirac equation there follows a
charge-conservation equation which is analogous to (23.15):

$$
\frac{\partial}{\partial t}\,|\psi|^2 = -\operatorname{div}\,(\psi^* c\boldsymbol{\alpha}\psi)
$$

where $|\psi|^2 = |\psi_1|^2 + |\psi_2|^2 + |\psi_3|^2 + |\psi_4|^2$.

Hint. Write Eq. (37.11) and its complex conjugate; multiply the first
by $\psi^*$ and the second by $\psi$, subtract the second from the first
and utilize the Hermiticity of $\boldsymbol{\alpha}$ and $\beta$.

4. Show that if $\psi$ is a solution with positive energy $E$, then
$\rho_2\psi$ is a solution with negative energy $-E$.

Solution. The equation for $\psi$ is

$$
E\psi = c\,(\boldsymbol{\alpha}\cdot\mathbf{p})\,\psi + mc^2\beta\psi
$$

Whence

$$
E\rho_2\psi = c\rho_2(\boldsymbol{\alpha}\cdot\mathbf{p})\,\psi + mc^2\rho_2\beta\psi
= -[c\,(\boldsymbol{\alpha}\cdot\mathbf{p}) + mc^2\beta]\,\rho_2\psi
$$

This proves that a negative-energy solution cannot be avoided.

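5. Prove that the operators

\[
\frac{\hbar}{2}\hat\sigma_x = \frac{\hbar}{2i}\hat\alpha_y\hat\alpha_z, \quad
\frac{\hbar}{2}\hat\sigma_y = \frac{\hbar}{2i}\hat\alpha_z\hat\alpha_x, \quad
\frac{\hbar}{2}\hat\sigma_z = \frac{\hbar}{2i}\hat\alpha_x\hat\alpha_y,
\]

operating on four-component functions are spin moment operators.
Solution. Since

\[
\frac{\hbar^2}{4}\hat\sigma_x^2 = -\frac{\hbar^2}{4}\hat\alpha_y\hat\alpha_z\hat\alpha_y\hat\alpha_z = \frac{\hbar^2}{4}
\]

\[
\frac{\hbar^2}{4}\hat\sigma_x\hat\sigma_y = -\frac{\hbar^2}{4}\hat\alpha_y\hat\alpha_z\hat\alpha_z\hat\alpha_x = \frac{\hbar^2}{4}\hat\alpha_x\hat\alpha_y = i\,\frac{\hbar^2}{4}\hat\sigma_z
\]

\[
\frac{\hbar^2}{4}\hat\sigma_y\hat\sigma_x = -i\,\frac{\hbar^2}{4}\hat\sigma_z
\]

the spin operators determined here possess all the required properties (see
Sec. 30). This can also be seen from the definition (37.10b) of $\alpha$ in terms of $\sigma$
and $\rho$. We notice that the spin operators do not mix the functions
of the first pair $(\psi_1, \psi_2)$ with those of the second pair $(\psi_3, \psi_4)$; they
make interchanges only inside each pair.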
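
The algebra of Exercise 5 can also be verified with explicit matrices. The following Python sketch assumes the standard representation $\alpha = \begin{pmatrix} 0 & \sigma \\ \sigma & 0 \end{pmatrix}$ built from the 2x2 Pauli matrices; the helper names are illustrative and the operators are taken in units of $\hbar/2$.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(s, a):
    """Multiply every matrix element by the number s."""
    return [[s * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    """Element-wise comparison of two matrices."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def alpha(pauli):
    """alpha = [[0, sigma], [sigma, 0]] (standard representation, assumed)."""
    z = [[0j] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            z[i][j + 2] = pauli[i][j]
            z[i + 2][j] = pauli[i][j]
    return z

SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
AX, AY, AZ = alpha(SX), alpha(SY), alpha(SZ)
ONE = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# sigma_x = (1/i) alpha_y alpha_z and cyclic permutations, in units of hbar/2
SIG_X = scale(-1j, matmul(AY, AZ))
SIG_Y = scale(-1j, matmul(AZ, AX))
SIG_Z = scale(-1j, matmul(AX, AY))

checks = {
    "sigma_x^2 = 1": close(matmul(SIG_X, SIG_X), ONE),
    "sigma_x sigma_y = i sigma_z": close(matmul(SIG_X, SIG_Y), scale(1j, SIG_Z)),
    "sigma_y sigma_x = -i sigma_z": close(matmul(SIG_Y, SIG_X), scale(-1j, SIG_Z)),
    "no mixing of the two pairs": all(
        abs(S[i][j + 2]) < 1e-12 and abs(S[i + 2][j]) < 1e-12
        for S in (SIG_X, SIG_Y, SIG_Z) for i in range(2) for j in range(2)),
}
print(checks)
```

The last entry tests that the off-diagonal 2x2 blocks vanish, which is the statement that the spin operators act only inside each pair of components.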

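6. Show that according to the Dirac equation only the sum of the
orbital and spin angular momenta, and not each angular momentum separately,
satisfies the angular-momentum conservation law.

Solution. The total angular momentum is defined as

\[
\hat{\mathbf{j}} = \hat{\mathbf{l}} + \hat{\mathbf{s}} = \mathbf{r} \times \hat{\mathbf{p}} + \frac{\hbar}{2}\hat{\boldsymbol{\sigma}}
\]

\[
\hat j_z = \hat l_z + \hat s_z = x\hat p_y - y\hat p_x + \frac{\hbar}{2i}\hat\alpha_x\hat\alpha_y
\]

We calculate the commutator with the Hamiltonian:

\[
\begin{aligned}
\hat H\hat j_z - \hat j_z\hat H
&= [c\,(\hat\alpha_x\hat p_x + \hat\alpha_y\hat p_y + \hat\alpha_z\hat p_z) + \hat\beta mc^2]\left(x\hat p_y - y\hat p_x + \frac{\hbar}{2i}\hat\alpha_x\hat\alpha_y\right) \\
&\quad - \left(x\hat p_y - y\hat p_x + \frac{\hbar}{2i}\hat\alpha_x\hat\alpha_y\right)[c\,(\hat\alpha_x\hat p_x + \hat\alpha_y\hat p_y + \hat\alpha_z\hat p_z) + \hat\beta mc^2] \\
&= c\hat\alpha_x\hat p_y\,(\hat p_x x - x\hat p_x) - c\hat\alpha_y\hat p_x\,(\hat p_y y - y\hat p_y) \\
&\quad + \frac{c\hbar}{2i}\hat p_x\,(\hat\alpha_x\hat\alpha_x\hat\alpha_y - \hat\alpha_x\hat\alpha_y\hat\alpha_x) + \frac{c\hbar}{2i}\hat p_y\,(\hat\alpha_y\hat\alpha_x\hat\alpha_y - \hat\alpha_x\hat\alpha_y\hat\alpha_y) \\
&= \frac{c\hbar}{i}\,(\hat\alpha_x\hat p_y - \hat\alpha_y\hat p_x + \hat p_x\hat\alpha_y - \hat p_y\hat\alpha_x) = 0
\end{aligned}
\]

The Hamiltonian commutes also with the square of the total angular momentum
$\hat j^2 = \hat j_x^2 + \hat j_y^2 + \hat j_z^2$, which is proved similarly. The integrals of motion
are $j^2$ and $j_z$, and not $l^2$, $l_z$ and $s^2$, $s_z$ separately.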

7. Show that the Dirac equation is invariant with respect to the substitution
of $-t$ for $t$, that is, to the time reversal operation (the $T$ operator).
Solution. Replacement of $t$ by $-t$ corresponds to a transformation to
the complex conjugate equation $\psi^*(-\hat p_t - \hat{\boldsymbol{\alpha}}\hat{\mathbf{p}} - \hat\beta mc^2) = 0$. We must
find the transformation that restores the initial form of the Dirac
equation. We pass to the transposed operators and get

\[
(-\hat p_t - \tilde{\boldsymbol{\alpha}}\hat{\mathbf{p}} - \tilde\beta mc^2)\,\psi^* = 0
\]

(see (37.20b)). The required transformation must satisfy the conditions

\[
T\tilde{\boldsymbol{\alpha}} = -\boldsymbol{\alpha}T, \quad T\tilde\beta = -\beta T
\]

or

\[
T\tilde\alpha_x = -\alpha_x T, \quad T\tilde\alpha_y = -\alpha_y T, \quad T\tilde\alpha_z = -\alpha_z T, \quad T\tilde\beta = -\beta T
\]

From this, $T = \rho_1\sigma_y = \alpha_y$. Consequently $CPT = 1$, that is, it does not
alter the Dirac equation.


SUPPLEMENTARY EXERCISES 

Sec. 1. (a) What is the unit of mass in grams, taking as the standard
the gravitational constant $G$ in the gravitation law $F = Gm_1m_2/r^2$, the
centimetre, and the second?

(b) What path relative to the earth is described by a person walking 
uniformly along the radius of a uniformly rotating platform? 

Sec. 2. (a) How many degrees of freedom do the following objects have: 
a pair of scissors? A bicycle (not counting the chain)? A meat grinder attached 
to a table? 

(b) Offer an example of motion of a system in conditions having the 
same symmetry as a sphere or a cylinder. 

Sec. 3. Write the Lagrange equations for a system with the Lagrangian

\[
L = \frac{1}{2}\,(\dot q_1^2 + q_1^2\dot q_2^2 + q_1^2\sin^2 q_2\,\dot q_3^2) - U(q_1)
\]

Sec. 4. (a) The proton-neutron binding energy in a deuterium nucleus
is 2.2 MeV. To what energy must a proton be accelerated to smash a deuterium
nucleus? Assume the masses of the proton and the neutron to be equal.

(b) What is the energy distribution between shell and gun at the time 
of firing? 

Sec. 5. (a) What is the ratio of the angular momenta of two planets of
equal mass orbiting the sun along circular orbits of radii $r_1$ and $r_2$?

(b) Express the major axis of an elliptical orbit in terms of the energy
of the planet.

(c) Express the period of rotation of a planet in terms of its energy,
making use of the fact that for $e < 1$

\[
\int_0^{2\pi} \frac{d\varphi}{(1 + e\cos\varphi)^2} = 2\pi\,(1 - e^2)^{-3/2}
\]
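
The integral quoted in exercise (c) is easy to confirm numerically before it is used. The Python sketch below compares a midpoint-rule estimate with the closed form for a few values of $e$; the function names, the sample values of $e$ and the number of subdivisions are arbitrary choices.

```python
import math

def kepler_integral(e, n=20000):
    """Midpoint-rule estimate of the integral of (1 + e cos(phi))**-2 over 0..2*pi."""
    h = 2.0 * math.pi / n
    return h * sum((1.0 + e * math.cos((k + 0.5) * h)) ** -2 for k in range(n))

def closed_form(e):
    """Right-hand side 2*pi*(1 - e**2)**(-3/2), valid for e < 1."""
    return 2.0 * math.pi * (1.0 - e * e) ** -1.5

for e in (0.0, 0.3, 0.6, 0.9):
    print(e, kepler_integral(e), closed_form(e))
```

Because the integrand is smooth and periodic, the midpoint rule converges very quickly here, so the two columns should agree to many digits.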

Sec. 6. (a) Determine the mean energy transferred from a colliding 
particle to an identical particle initially at rest, for the case of isotropic 
scattering in the centre-of-mass reference frame. 

(b) Determine the maximum angle at which a particle of mass 3m 
may be scattered on a resting particle of mass m. 

(c) A particle of mass $m_1$ impinges on a particle of mass $m_2$, which is
at rest, and adheres to it. Express the kinetic energy of the combined particle
in terms of the kinetic energy of the impinging particle.

Sec. 7. (a) Consider the oscillations of a particle with one degree of
freedom in the presence of an elastic force $-m\omega_0^2 x$ and a friction force
directed against the velocity and proportional to its magnitude.

Solution. From the condition, the equation of motion is

\[
m\ddot x = -m\omega_0^2 x - \alpha\dot x
\]

The solution has the form $x = \operatorname{Re}(Ce^{-i\omega t})$, where $C$ and $\omega$ are complex
quantities. For $\omega$ we obtain the quadratic equation $m\omega^2 - m\omega_0^2 + i\alpha\omega = 0$,
whence

\[
\omega = -\frac{i\alpha}{2m} \pm \left(\omega_0^2 - \frac{\alpha^2}{4m^2}\right)^{1/2}
\]

If $\omega_0 > \alpha/(2m)$, the square root is real, and the solution has the form of
damped oscillations:

\[
x = e^{-\alpha t/(2m)}\operatorname{Re}\left\{C\exp\left[\pm i\left(\omega_0^2 - \frac{\alpha^2}{4m^2}\right)^{1/2} t\right]\right\}
\]

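If the friction is great, the root is imaginary and

\[
\left(\omega_0^2 - \frac{\alpha^2}{4m^2}\right)^{1/2} = i\left(\frac{\alpha^2}{4m^2} - \omega_0^2\right)^{1/2}
\]

The solution is aperiodic and falls off exponentially:

\[
x = e^{-\alpha t/(2m)}\left\{C_1\exp\left[-\left(\frac{\alpha^2}{4m^2} - \omega_0^2\right)^{1/2} t\right]
+ C_2\exp\left[\left(\frac{\alpha^2}{4m^2} - \omega_0^2\right)^{1/2} t\right]\right\}
\]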
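
The two regimes can be told apart directly from the roots of the quadratic equation for $\omega$. The following Python sketch computes both roots with complex arithmetic and labels the regime; the function names and the sample parameters are illustrative assumptions, and the borderline case $\omega_0 = \alpha/(2m)$ is simply counted as aperiodic.

```python
import cmath

def damped_frequencies(m, omega0, alpha):
    """Both roots of m*w**2 - m*omega0**2 + 1j*alpha*w = 0."""
    root = cmath.sqrt(omega0 ** 2 - alpha ** 2 / (4.0 * m ** 2))
    shift = -1j * alpha / (2.0 * m)
    return shift + root, shift - root

def regime(m, omega0, alpha):
    """Real square root: damped oscillations; otherwise aperiodic decay."""
    return "damped oscillations" if omega0 > alpha / (2.0 * m) else "aperiodic"

for m, omega0, alpha in ((1.0, 2.0, 0.5), (1.0, 1.0, 4.0)):
    w1, w2 = damped_frequencies(m, omega0, alpha)
    # Im(w) < 0 means that exp(-1j*w*t) decays with time.
    print(regime(m, omega0, alpha), w1, w2)
```

In the oscillatory case both roots share the imaginary part $-\alpha/(2m)$; in the aperiodic case both roots are purely imaginary with negative imaginary parts, so the factor $e^{-i\omega t}$ decays without oscillating.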

(b) Consider the motion of a particle in the presence of an elastic force
and an external driving force $\operatorname{Re}(fe^{-i\omega t})$, where $f = |f|\,e^{-i\psi}$.

Answer.

\[
x = \operatorname{Re}\left(\frac{fe^{-i\omega t}}{m(\omega_0^2 - \omega^2)}\right) + \operatorname{Re}(Ce^{-i\omega_0 t})
\]

(c) Ditto, for the case $\omega = \omega_0$ (resonance).

Answer. The term due to the external force has the form

\[
x = \operatorname{Re}\left(\frac{ift\,e^{-i\omega_0 t}}{2m\omega_0}\right)
\]

The oscillation amplitude increases linearly with time.

(d) Solve problem (b) on the assumption that in addition to the driving
force there is a frictional force $-\alpha\dot x$.

Answer. The term due to the external force is

\[
x = \operatorname{Re}\left(\frac{fe^{-i\omega t}}{m(\omega_0^2 - \omega^2) - i\alpha\omega}\right)
= \frac{|f|\cos(\omega t + \psi + \varphi)}{[m^2(\omega_0^2 - \omega^2)^2 + \alpha^2\omega^2]^{1/2}}
\]

\[
\varphi = -\arctan\frac{\alpha\omega}{m(\omega_0^2 - \omega^2)}
\]

with the arctangent taken between $0$ and $\pi$. The oscillation amplitude
remains finite at all frequency values of the driving force.

The phase difference $\varphi$ between the oscillations of the driving force
and the mass point is for three different cases thus:

(i) $\omega \ll \omega_0$, $\alpha\omega/m \ll \omega_0^2$, $\varphi = 0$. The particle follows the driving
force, coming in step with its phase.

(ii) $\omega \gg \omega_0$, $\alpha/m \ll \omega$, $\varphi = -\pi$. Owing to inertia the particle oscillates
in the opposite phase with respect to the driving force.

(iii) $\omega = \omega_0$ (resonance), $\varphi = -\pi/2$. The phase of the particle is shifted
by $-\pi/2$ with respect to the phase of the driving force.
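
The three limiting cases can be reproduced from the complex amplitude of the forced motion. The Python sketch below assumes the sign convention that the steady-state motion is $|X|\cos(\omega t + \psi + \varphi)$, so that $\varphi$ is the phase of the particle relative to the driving force; this convention is chosen to agree with the values quoted in cases (i) to (iii), and the function names and numerical parameters are illustrative.

```python
import cmath
import math

def forced_response(m, omega0, alpha, omega, f=1.0 + 0j):
    """Complex amplitude X such that x(t) = Re(X * exp(-1j*omega*t))."""
    return f / (m * (omega0 ** 2 - omega ** 2) - 1j * alpha * omega)

def amplitude_and_phase(m, omega0, alpha, omega, f=1.0 + 0j):
    """Amplitude |X| and phase phi of the particle relative to the force.

    With f = |f| exp(-1j*psi) the force varies as cos(omega*t + psi) and the
    particle as cos(omega*t + psi + phi), hence phi = -arg(X / f).
    """
    X = forced_response(m, omega0, alpha, omega, f)
    return abs(X), -cmath.phase(X / f)

for omega in (0.01, 1.0, 100.0):
    amp, phi = amplitude_and_phase(m=1.0, omega0=1.0, alpha=0.01, omega=omega)
    print(omega, amp, phi / math.pi)
```

For weak friction the printed phase, in units of $\pi$, goes from nearly $0$ far below resonance through $-1/2$ at resonance to nearly $-1$ far above it, while the amplitude stays finite throughout.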

(e) Consider the oscillations of an elastically bound particle in the
presence of “dry” friction $F$ directed always against the velocity, and in
magnitude independent of the velocity.

Solution. For the first half-period, assuming the velocity during it
to be positive, we have the equation of motion $m\ddot x + m\omega_0^2 x = -F$. Multiply
this equation by $\dot x$ and integrate from certain initial values of the
coordinate and velocity, $x_0$ and $\dot x_0$. The integral of the motion reduces to
the form

\[
\left(x + \frac{F}{m\omega_0^2}\right)^2 + \frac{\dot x^2}{\omega_0^2}
= \left(x_0 + \frac{F}{m\omega_0^2}\right)^2 + \frac{\dot x_0^2}{\omega_0^2}
\]

This is the equation of a semicircle in the plane $(x, \dot x/\omega_0)$ centred on
$x = -F/(m\omega_0^2)$, $\dot x = 0$. The semicircle terminates at the intersection with
the $x$ axis, where the velocity reverses its sign. If at that point $|x| < F/(m\omega_0^2)$,
the motion ceases, since the elastic force is no longer able to overcome the
force of friction, which from that instant is in the opposite direction. If
$|x| > F/(m\omega_0^2)$ at $\dot x = 0$, the motion continues until the rest point occurs
within the interval $-F/(m\omega_0^2) < x < F/(m\omega_0^2)$.

This phenomenon is called stagnation. It lowers the accuracy of instruments
in which dry friction has not been eliminated.
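
The semicircle construction translates into a simple rule for successive turning points: each half-swing reflects the turning point about the centre of the current semicircle, which lies at $-F/(m\omega_0^2)$ while the particle moves in the positive direction and at $+F/(m\omega_0^2)$ while it moves in the negative direction. The Python sketch below follows this rule until the particle sticks; the function name and the numbers in the example are illustrative assumptions.

```python
def stagnation_point(x_turn, F, m, omega0):
    """Follow successive turning points until the particle sticks.

    x_turn is a position at which the velocity is zero. Returns the final
    rest position and the number of half-swings performed.
    """
    d = F / (m * omega0 ** 2)       # half-width of the stagnation interval
    x, swings = x_turn, 0
    while abs(x) > d:
        # A particle released at x < -d moves in the +x direction (centre -d);
        # one released at x > d moves in the -x direction (centre +d).
        centre = -d if x < 0 else d
        x = 2.0 * centre - x        # opposite end of the semicircle
        swings += 1
    return x, swings

print(stagnation_point(-5.5, F=1.0, m=1.0, omega0=1.0))
```

Each half-swing reduces the distance of the turning point from the origin by $2F/(m\omega_0^2)$, so the rest position depends on the initial deflection and can lie anywhere inside the stagnation interval.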

Sec. 8. (a) At what latitude will the oscillation plane of a Foucault 
pendulum describe a complete circle in 48 hours? 

(b) Would the oscillation plane of a pendulum rotate on the moon? 

Sec. 9. (a) Determine the principal axes of a molecule of HDO, assum¬ 
ing the bond lengths of HO and DO to be equal and at an angle of 108° 
(D is deuterium). 

(b) Determine the frequency of small oscillations of a homogeneous 
elliptical cylinder with semiaxes a and b lying in the horizontal plane. 

Sec. 10. (a) Find the Hamiltonian if the Lagrangian is given in the 
problem to Section 3, and write the corresponding Hamilton equations. 

(b) Write the Hamilton-Jacobi equation for the foregoing exercise 
and separate the variables. 

(c) Write the expressions for the adiabatic invariants of a free sym¬ 
metric top. 

(d) Find the path of motion in Kepler’s problem, assuming that the 
momentum projections are plotted along the coordinate axes. 

Hint. As in the case of (10.13), determine the transformation function 
F"(p, P)j and write the Hamilton-Jacobi equation for it. Introduce the 
square of the momentum and its direction in the plane of motion as the 
dependent variables. 

Sec. 11. (a) Write the components of curl of a certain vector in cylin¬ 
drical coordinates. 

(b) Compare the expressions for the Laplacians of a scalar and a vector 
in cylindrical coordinates. 

Sec. 12. (a) Write Maxwell’s equations and the equations for the potenti¬ 
als in spherical coordinates. 

Sec. 13. (a) A particle travelling with a velocity $v_1 = 0.95c$ emits another
particle, which travels relative to it with a velocity $v_2 = 0.99c$ at an angle
of 30° in a reference frame in which the emitting particle rests. Find the
magnitude and direction of the velocity of the emitted particle with respect
to the initial (fixed) frame of reference.

(b) A particle travelling with a velocity $v = 0.999c$ disintegrates into
two particles of zero mass. Determine the angle between the directions of
emission of the photons in the laboratory frame of reference, if in the reference
frame fixed with respect to the disintegrating particle one of the zero-mass
particles is travelling at a 30° angle to the direction of the velocity
of the disintegrating particle.

Sec. 14. (a) Through what potential difference must an initially stationary
electron pass so as to attain a velocity of $0.999c$?

(b) Determine the motion of a charge in constant and homogeneous
electric and magnetic fields whose directions coincide.

(c) Ditto, in perpendicular fields. 

(d) Show that the path of a charge in a constant and homogeneous 
magnetic field is of similar form in coordinate and momentum space (that 
is, a space in which the momentum components are plotted along the axes), 
and is not determined by the form of the energy-momentum dependence. 

Hint. Compare with exercise (d) to Section 10. 

Sec. 15. (a) A constant magnetic field acts parallel to the plates of a 
charged plane capacitor. Explain why no energy flux develops, although 
formally the Poynting vector is not zero. 

Hint. Find the divergence of the Poynting vector. 

(b) Using the relativistically invariant notation of the equations of 
motion of a charge in an electromagnetic field (14.33), show that in a con- 

4 

stant, homogeneous field the solution has the form x t = ^ % 

n=l 

where the quantities x (n > are expressed in terms of the invariants of the 
electromagnetic field tensor. 

(c) Show that the invariants of the tensor F\k = coincide 

with the invariants of the tensor F ik . 

Sec. 16. (a) A quadrupole is formed by four charges, 1, —1, 2, —2, 
located at the apexes of a parallelogram with sides 1 and 2 forming an angle 
45°. Find the principal axes of the tensor of the quadrupole moment and 
the value of the moment relative to the principal axes. 

(b) The principal moments of inertia of an ellipsoid of rotation are
$I_1$, $I_2 = I_3$. Determine, in the quadrupole approximation, the components
of the potential of the force of gravity on the symmetry axis and in the
median plane perpendicular to it.

(c) A diatomic molecule with moments of inertia $I_1 = 0$, $I_2 = I_3$,
and dipole moment $d$ is placed in a constant, homogeneous electric field.
Write its equations of motion and reduce to quadratures (the integral cannot
be expressed in elementary functions).

Sec. 17. A symmetric molecule with principal moments of inertia $I_1$,
$I_2 = I_3$ and with the magnetic moment rigidly connected with the direction
of the first principal axis of inertia, is placed in a constant magnetic field.
Show that its motion is similar to the motion of a symmetric top in a gravitational
field.

Sec. 18. (a) The electric field components of a plane wave travelling
along the $x$ axis are: $E_y = E_1\cos p + E_2\sin p$, $E_z = E_1\sin p - E_2\cos p$.
Using (18.36), find the components of the complex vector $\mathbf{F}$ and show that
(i) the relation (18.38) is identically satisfied, (ii) $F_1^2 + F_2^2 = E_1^2 + E_2^2$,
and (iii) the absolute magnitude of vectors $\mathbf{F}_1$ and $\mathbf{F}_2$ and the angle between
them do not depend upon angle $p$.

(b) At the initial time, the $E_y$ component of the electric field of a plane
electromagnetic wave travelling along the $x$ axis was given by the function
$f(x)$, and its time derivative by the function $g(x)$. Write the expression
for $E_y$ at an arbitrary instant.

(c) Compare the relations (18.21), (18.22), and (18.31) with the for¬ 
mula expressing the energy of a zero-mass particle in terms of its momentum. 

Sec. 19. (a) A plane monochromatic wave of frequency $\omega$ impinges normally
on a screen with an aperture of radius $a$. Determine the approximate size
of the illuminated region on another screen parallel to the first at a distance
$d > a$ beyond the aperture, and find the conditions under which the diffraction
region is considerably larger than that obtained on the basis of geometrical
optics, that is, from a construction of rectilinear rays.

Sec. 20. (a) A plane circularly-polarized wave impinges on an electron.
Find the elliptical polarization of the wave scattered at angle $\theta$ to the initial
direction.

(b) Taking into account that the resultant momentum of the electromagnetic
field radiated by a dipole is zero, show that a charge at rest in the
field of a plane electromagnetic wave is subject to a force $(2/3)\,e^4|E|^2/(m^2c^4)$
in the direction of propagation of the wave.

(c) A charge at rest is subject to the action of the electromagnetic
field of a travelling electromagnetic wave varying according to the law
$E = E_0\sin\omega t$. Show that oscillations of double frequency take place in the
perpendicular direction, and determine the energy radiated by the oscillations
per unit time.

Sec. 21. (a) Assuming the phase velocity $u$ to be a function of one Cartesian
coordinate $x$, and using (21.7) as the Hamilton-Jacobi equation, find
the equation of a light beam of given frequency $\omega$.

(b) Find the relation between the phase velocity, group velocity, and
velocity of light in vacuum for the case when $\omega = c\,(k_0^2 + k^2)^{1/2}$.

Sec. 22. Consider the following thought experiment offered to “refute” 
the uncertainty principle. We have a screen with two apertures, one above 
the other. Particles are passed through the apertures one by one. The diffrac¬ 
tion pattern produced by them on another screen is such as though each par¬ 
ticle had passed through both apertures in the first screen. At the same time, 
the vertical component of the momentum received by the second screen on 
impact of a particle is measured. The assertion is made that if this component 
is directed upwards, the particle must have passed through the lower aperture 
in the first screen, and vice versa. This assertion is incompatible with the 
uncertainty principle. Show the error in the reasoning. 

Hint. Apply the uncertainty principle to the second screen as a universal
principle and show that in measuring the momentum the uncertainty
in the coordinate is equal to the width of the diffraction band.

Sec. 23. (a) The wave function of a particle has the form

\[
\psi(x, t) = \int_{-\infty}^{\infty} C(p)\,e^{-i(Et - px)/\hbar}\,dp
\]

where the amplitude $C(p)$ is assumed to be other than zero only close to
a certain $p = p_0$, and $E = p^2/(2m)$. This wave function describes a so-called
wave packet. Find the propagation velocity of the maximum value of the
amplitude of the wave function, or in other words, the velocity of the wave
packet.

(b) In the previous exercise, taking $C(p) = e^{-(p - p_0)^2/[2(\Delta p)^2]}$, where
$p_0 \gg \Delta p$, show that the minimum spatial width of the wave packet satisfies
the inequality $\delta x > [\hbar t/(2m)]^{1/2}$. Here, the width is determined from

\[
|\psi|^2 \sim e^{-(x - \bar x)^2/[2(\Delta x)^2]}
\]

according to the amplitude of the wave function.

Sec. 24. (a) Develop the spherical functions $Y_0^0$, $Y_1^{-1}$, $Y_1^0$, $Y_1^1$, $Y_2^{-2}$, $Y_2^{-1}$,
$Y_2^0$, $Y_2^1$, $Y_2^2$, and verify their orthogonality.

(b) Find the commutator of M% and M\. 

(c) Find the commutator of $\hat p_x$ and $1/r$.

Sec. 25. (a) Compute (cos 2 ft) in states with wave functions YJ, YJ, 
which must first be normalized. 

(b) The azimuthal dependence of the wave function of a particle is
other than zero and constant for $0 \le \varphi \le \pi$, and equal to zero for
$\pi < \varphi < 2\pi$. Expand it in a set of eigenfunctions of the orbital angular momentum
projection and determine the probability of a certain eigenvalue of the
angular momentum projection.

(c) Let $\vartheta$, $\varphi$ be the polar angle and azimuth of a point with respect to
axis $z$, and $\theta$ and $\chi$ be the same with respect to axis $x$. Using the fact that
$\cos\vartheta = \sin\theta\cos\chi$, $\cos\theta = \sin\vartheta\sin\varphi$, expand the spherical functions
$Y_l^m(\vartheta, \varphi)$ in the functions $Y_l^{m'}(\theta, \chi)$.

Sec. 26. (a) Express $e^{-r^2/(2r_0^2)}$ (where $r_0$ is a constant quantity) in the
momentum representation.

(b) Show that the operator $\hat A$ of a finite shift, whose operation on a
function is defined as $\hat A\psi(x) = \psi(x + a)$, has in momentum representation
the form $\hat A = e^{iap_x/\hbar}$.

Sec. 27. The probability of the appearance of a certain eigenvalue of
energy $E_n$ in a system is $w_n = e^{-\beta E_n}$, where $\beta > 0$. Find the mean value
of the quantity $\lambda$ in coordinate representation.

Sec. 28. (a) Find the energy spectrum of a particle in a potential “tunnel”
of infinite length and constant rectangular cross section. The walls are
impermeable for the particle.

(b) Find the energy spectrum of a charged particle in a constant, homogeneous
magnetic field. Represent the vector potential of the field in the
form $A_x = 0$, $A_y = Hx$, $A_z = 0$ (the Landau gauge).

Sec. 29. (a) Consider the one-dimensional motion of a particle in a
potential field of the form

\[
U = D\,(1 + e^{-2ax} - 2e^{-ax}), \quad \text{where } -\infty < x < \infty
\]

and show that the discrete energy spectrum is given by the formula

\[
E_n = \hbar a\left(\frac{2D}{m}\right)^{1/2}\left(n + \frac{1}{2}\right) - \frac{\hbar^2a^2}{2m}\left(n + \frac{1}{2}\right)^2
\]

with $n$ limited from above. Plot the result on a graph.

Hint. Substitute $e^{-ax} = y$, $\psi = f/y^{1/2}$ and compare with Eq. (29.32),
then make use of Eq. (29.37).
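
Before plotting, it helps to tabulate the levels and to see where the discrete spectrum ends. The Python sketch below assumes the standard result for this potential, $E_n = \hbar a(2D/m)^{1/2}(n + 1/2) - (\hbar^2a^2/2m)(n + 1/2)^2$, together with the bound-state condition $n + 1/2 < (2mD)^{1/2}/(\hbar a)$; the function name, the units $\hbar = m = 1$ and the sample values of $D$ and $a$ are illustrative assumptions.

```python
import math

def morse_levels(D, a, m=1.0, hbar=1.0):
    """Discrete levels of U = D(1 + exp(-2ax) - 2 exp(-ax)), lowest first."""
    omega = a * math.sqrt(2.0 * D / m)                     # small-oscillation frequency
    n_limit = math.sqrt(2.0 * m * D) / (hbar * a) - 0.5    # bound states need n < n_limit
    levels = []
    n = 0
    while n < n_limit:
        k = n + 0.5
        levels.append(hbar * omega * k - (hbar * a * k) ** 2 / (2.0 * m))
        n += 1
    return levels

for n, energy in enumerate(morse_levels(D=8.0, a=1.0)):
    print(n, energy)
```

The spacing between neighbouring levels shrinks linearly with $n$, and all levels lie below the dissociation limit $U(\infty) = D$, which is why $n$ is limited from above.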

(b) Consider the motion of a particle in the field of an attracting centre
the potential of which is $U = -a/r^2$. Show that for small values of $l$ there
is no solution that near zero is represented by a real and positive power of $r$,
which in classical mechanics corresponds to falling onto a centre.

(c) What are the values of angular momenta obtained in the addition 
of three angular momenta respectively equal to 1, 2, and 3? 

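Sec. 30. (a) Given an operator $\hat t = 3\,(\hat{\boldsymbol{\sigma}}_1\cdot\mathbf{n})(\hat{\boldsymbol{\sigma}}_2\cdot\mathbf{n}) - (\hat{\boldsymbol{\sigma}}_1\cdot\hat{\boldsymbol{\sigma}}_2)$, where
$\hat{\boldsymbol{\sigma}}_1$ and $\hat{\boldsymbol{\sigma}}_2$ are the Pauli operators for two particles, and $\mathbf{n}$ is a unit vector
along the line joining the particles. Show that the eigenvalues of $\hat t$ are equal
to $-4$, $0$, and $2$, the latter being two-fold degenerate.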
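
The stated eigenvalues can be confirmed with explicit 4x4 matrices. The Python sketch below assumes $\mathbf{n}$ directed along the $z$ axis, which loses no generality since the operator is a scalar under simultaneous rotation of $\mathbf{n}$ and the spins, and it uses the basis of products of one-particle spin states; the helper names and the choice of trial vectors are illustrative.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def tensor_operator():
    """t = 3 (s1.n)(s2.n) - (s1.s2) with n along z (assumed direction)."""
    zz, xx, yy = kron(SZ, SZ), kron(SX, SX), kron(SY, SY)
    return [[3 * zz[i][j] - (xx[i][j] + yy[i][j] + zz[i][j]) for j in range(4)]
            for i in range(4)]

def is_eigenvector(t, lam, v, tol=1e-12):
    """True if t v = lam v for the column vector v."""
    tv = [sum(t[i][j] * v[j] for j in range(4)) for i in range(4)]
    return all(abs(x - lam * y) < tol for x, y in zip(tv, v))

T = tensor_operator()
# Basis order: up-up, up-down, down-up, down-down.
CASES = [
    (2, [1, 0, 0, 0]),      # both spins along n
    (2, [0, 0, 0, 1]),      # both spins against n
    (-4, [0, 1, 1, 0]),     # symmetric combination with zero projection on n
    (0, [0, 1, -1, 0]),     # antisymmetric (singlet) combination
]
results = [is_eigenvector(T, lam, v) for lam, v in CASES]
print(results)
```

The four trial vectors are mutually orthogonal, so they exhaust the spectrum: $2$ twice, $-4$ and $0$, with zero sum as required by the vanishing trace of $\hat t$.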

(b) Explain why the most general form of the density matrix in the
spin-variable space of one particle of half-integral spin is

\[
\rho = a + b\,(\mathbf{n}\cdot\boldsymbol{\sigma})
\]

where $\mathbf{n}$ is a unit vector. Find $\langle\sigma_x\rangle$.

Sec. 31. (a) Determine, in the quasi-classical approximation, the energy
levels in a potential field $U(x) = D\,(1 + e^{-2ax} - 2e^{-ax})$. Explain the
obtained result with the help of a mathematical analogy with Kepler’s problem
(compare with problem (a) to Section 29).

(b) Show that in the quasi-classical approximation Kepler’s problem
yields correct values of the energy levels if the angular momentum square
is taken to be $(l + 1/2)^2$ rather than $l(l + 1)$.

(c) In the quasi-classical approximation, find the probability of a particle
of energy $E$ penetrating a potential barrier of the form $U = U_0 - ax^2$,
$E < U_0$, $a > 0$.

Sec. 32. (a) A particle is placed in a spherical potential well, such that
$U = -|U_0|$ for $r < r_0$, and $U = 0$ for $r > r_0$. The depth of the well
varies by a constant quantity: $U = -(U_0 + \delta U_0)$. Consider the change
in the energy eigenvalue in the first approximation of perturbation theory
and compare with the exact formula for the energy of bound states.

(b) The nucleus of a deuterium atom possesses quadrupole moment $q$.
Find the splitting of the levels of the $2p$ state of the electron.

(c) A system has two energy levels the separation of which is
$E_1 - E_0 = \hbar\omega_0$. It is subject to a perturbation $V(t, x)$ which depends upon time
according to the law

\[
V(t, x) = \int_{-\infty}^{+\infty} V(\omega, x)\cos\omega t\,d\omega
\]

At the initial time the amplitude of the zero state $c_0 = 1$, so that the
amplitude of the first state $c_1 = 0$. Determine the amplitude of the first
state after a sufficiently long time.

Hint. A “sufficiently long time” is a time interval for which we can
use the relationship

\[
\int_0^{\infty} f(\omega)\,\frac{e^{i(\omega - \omega_0)t}}{\omega - \omega_0}\,d\omega = \pi i f(\omega_0)
\]

which is developed similarly to formula (32.39).

Sec. 33. (a) What power of the atomic number is the mean distance of 
the electron from the nucleus calculated by the Thomas-Fermi method pro¬ 
portional to? 

(b) Using only parity considerations, obtain the selection rule for
matrix elements of the coordinate with respect to the quantum number $l$,
similar to the way it was developed for the vector y with respect to $J$
(Exercise 4, Section 33).

Sec. 34. What are the rotational states of an oxygen molecule $\mathrm{O}_2^{16}$ and
its heavy isotope $\mathrm{O}_2^{17}$? (The nuclear spin of $\mathrm{O}^{17}$ is determined by one neutron
above the occupied shells.)

Sec. 35. (a) In the Born approximation, find the differential cross section 
of elastic scattering of electrically charged particles of dipole moment d. 

Hint. Use the limiting transition from a system of two charges at finite 
distance from each other to a dipole. 

(b) Find the partial scattering cross section of a particle in the field
of a repulsive centre of potential $U = a/r^2$ if it is known that the
solution of the equation $y'' + (2\lambda/x)\,y' + k^2y = 0$ that is finite for $x = 0$
has, for $x \to \infty$, the asymptotic form $y = x^{-\lambda}\cos(kx - \pi\lambda/2)$.

Sec. 36. (a) Taking the quantized motion of an oscillating dipole, cal¬ 
culate the radiation intensity in a transition from the first excited state 
to the ground state, assuming the dipole moment to be proportional to the 
oscillator coordinate. Compare with the corresponding classical formula. 

(b) Calculate the radiation intensity of a hydrogen atom in a transition
from the $2p$ to the $1s$ state.

(c) What is the multipole order of the transitions

4 Sf/2 — 4 ^/2. vf. 2 P m — 2 P ll2 ?

Sec. 37. (a) Separate the variables in the Dirac equation for a central
field, assuming the first two functions to be dependent only upon the radius.

Hint. Choose the first pair of wave functions in the form $0, \psi_2$ and $\psi_1, 0$;
seek the angular dependence of the second pair from the form of the spherical
functions for $l = 1$.

(b) Show that if the Dirac wave function is subjected to the transformation
$\psi' = C_m\psi = (1/\sqrt{2})\,(\alpha_y + \beta)\,\psi$, the Dirac equation will involve
only real coefficients. All the $\alpha$, $\beta$ matrices are assumed to have been selected
in accordance with the equations of Section 37.

(c) Show that the wave functions of an electron and a positron are of 
opposite parity. 

(d) On the basis of the result of the preceding problem, show that an 
electron-positron system in a state with total spin 1 can annihilate only by 
disintegrating into three quanta. 

Hint. Take into account that an electron-positron interchange leaves
the Hamiltonian invariant only if the sign of the amplitude of the electromagnetic
field is simultaneously changed.